I ran into this problem when installing Swish-e and resolved it by using
the -e option and pointing the temp directory (TmpDir) at a drive that has
lots of space (almost 100GB). Now it appears to be an issue again. I've also
tried indexing in a command window using my admin account and receive the
same error. We use the following command to run the indexing job:

D:\ProgramFiles\Swish-E\swish-e -S http -e -c D:\ProgramFiles\Swish-E\conf\siteindex.config

The config file looks like this:
# ----- SiteIndex.config - Spider using "http" method -------
#
# Please see the swish-e documentation for
# information on configuration directives.
# Documentation is included with the swish-e
# distribution, and also can be found on-line
# at http://swish-e.org
#
#
# This example demonstrates how to use
# the "http" method of spidering.
#
# Indexing (spidering) is started with the following
# command issued from the "d:\Program Files\Swish-e" directory:
#
# swish-e -S http -c Siteindex.config
#
# Note: You should have the current Bundle::LWP bundle
# of perl modules installed. This was tested with:
# libwww-perl-5.53
#
# ** Do not spider a web server without permission **
#
#---------------------------------------------------
# Include our site-wide configuration settings:
IncludeConfigFile D:/ProgramFiles/Swish-E/conf/Settings.config
# Specify the URL (or URLs) to index:
IndexDir http://www.hr.msu.edu/hrsite
# If a server goes by more than one name you can use this directive:
# EquivalentServer http://swish-e.org http://www.swish-e.org
# This defines how many links the spider should
# follow before stopping. A value of 0 configures the spider to
# traverse all links. The default is 5
# The idea is to limit spidering, but seems of questionable use
# since depth may not be related to anything useful.
MaxDepth 10
# The number of seconds to wait between issuing
# requests to a server. The default is 60 seconds.
Delay 1
# Skip pages with Meta tag "noindex"
obeyRobotsNoIndex yes
# (default /var/tmp) The location of a writeable temp directory
# on your system. The HTTP access method tells the Perl helper to place
# its files there. The default is defined in src/config.h and depends on
# the current OS.
TmpDir D:/Inetpub/Indexes/Temp
# The "http" method uses a perl helper program called "swishspider" to
# fetch each document from the web. It is included in the src directory of
# the swish-e distribution.
SpiderDirectory D:/ProgramFiles/Swish-E
# Put the index files in the Inetpub/Indexes directory
IndexFile D:/Inetpub/Indexes/SiteIndex.New.index
# end of SiteIndex Config file
I am receiving the following warning in my log files from the indexing job:
Warning: Configuration setting for TmpDir 'D:/Inetpub/Indexes/Temp' will be
overridden by environment setting 'C:\DOCUME~1\rek\LOCALS~1\Temp' which does
not exist. When I look in the specified temp directory I find Swish-e
work files, so I'm not sure whether this is a problem or not.
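
One way to see what the indexing job inherits is to list the temp-related
environment variables and check whether each directory exists. The sketch
below is only a diagnostic aid, not part of Swish-e. It assumes the
"environment setting" in the warning comes from one of TMPDIR, TMP or TEMP;
those variable names and the helper name report_temp_settings are
assumptions made for illustration.

```python
import os

# Assumption: the "environment setting" named in the warning comes from one
# of these variables. The names are a guess, not taken from the message.
CANDIDATE_VARS = ("TMPDIR", "TMP", "TEMP")


def report_temp_settings(env=None):
    """Return (name, value, directory_exists) for each candidate variable set."""
    env = os.environ if env is None else env
    report = []
    for name in CANDIDATE_VARS:
        value = env.get(name)
        if value is None:
            continue  # variable is not set for this account
        report.append((name, value, os.path.isdir(value)))
    return report


def print_temp_settings():
    """Print one line per variable that is set in the current environment."""
    for name, value, exists in report_temp_settings():
        status = "exists" if exists else "MISSING"
        print(f"{name}={value} ({status})")
```

Calling print_temp_settings() under the same account that runs the indexing
job would show whether the directory named in the warning is really missing
for that account.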
The summary from the last good indexing run, on 9/8, looks like this:
1468 files indexed. 39839610 total bytes. 810188 total words.
Elapsed time: 00:32:05 CPU time: 00:32:05
Indexing done!
We are using the latest Windows version of Swish-e on a Windows 2000 server.
The archives and FAQ point to the -e option to fix memory issues. What have
I missed?
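
A possible experiment, sketched under assumptions rather than offered as a
known fix: run the same command with the temp-related environment variables
pointed at the directory from the config file, and see whether the warning
goes away. The paths are copied from this message; the variable names
(TMPDIR, TMP, TEMP) and the helper names are assumptions.

```python
import os
import subprocess

# Paths copied from the message above.
SWISH_E = r"D:\ProgramFiles\Swish-E\swish-e"
CONFIG = r"D:\ProgramFiles\Swish-E\conf\siteindex.config"
TMP_DIR = "D:/Inetpub/Indexes/Temp"

# Assumption: these are the variables behind the "environment setting"
# mentioned in the warning.
TEMP_VARS = ("TMPDIR", "TMP", "TEMP")


def build_env(tmp_dir, base_env=None):
    """Copy the environment and point every temp variable at tmp_dir."""
    env = dict(os.environ if base_env is None else base_env)
    for name in TEMP_VARS:
        env[name] = tmp_dir
    return env


def build_command(swish_e=SWISH_E, config=CONFIG):
    """Build the same command line used for the indexing job."""
    return [swish_e, "-S", "http", "-e", "-c", config]


def run_index():
    """Run the indexing job with the overridden environment."""
    return subprocess.run(build_command(), env=build_env(TMP_DIR), check=True)
```

Calling run_index() changes only the environment of the child process, so
the account's own settings are left untouched.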
Rick
Richard Klingensmith
MSU Human Resources Information Systems
1407 S. Harrison Road Ste. 40
East Lansing, MI 48823
(517) 432-4636 ext. 155
klingensmith@hr.msu.edu
Received on Thu Sep 11 15:33:46 2003