Re: Incremental Indexing when spidering

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Jun 07 2001 - 17:17:11 GMT
At 09:56 AM 06/07/01 -0700, Gavin Walker wrote:
>So the question is "Can -S http and -N file be used together?".

Nope.  The swishspider program doesn't add the last modified date to the
index.  The plan was to fix that at some point... a long time ago.

If you spider with -S prog, the last modification date does get added
correctly, and then you can use -N.  The spider.pl that's in the development
version can be used for this.
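
As a rough illustration of why -S prog can carry the date: with -S prog, the
external program hands swish-e each document along with a few header lines, and
one of those can be the last modification time.  The sketch below is not
spider.pl; it is a minimal stand-in that assumes the header names Path-Name,
Content-Length and Last-Mtime as described in the development version's docs
for -S prog (check them against your version).  The function name, URL and
document body are made up for the example.

```python
import sys
import time


def format_prog_document(path_name: str, body: bytes, last_mtime: int) -> bytes:
    """Build one document record: header lines, a blank line, then the body.

    Header names are an assumption taken from the -S prog documentation;
    verify them against the swish-e version you are running.
    """
    headers = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(body)}\n"   # length of the body in bytes
        f"Last-Mtime: {last_mtime}\n"      # seconds since the epoch
        "\n"                               # blank line ends the headers
    )
    return headers.encode("ascii") + body


if __name__ == "__main__":
    # Hypothetical document; a real spider would fetch this over HTTP and
    # take the date from the server's Last-Modified response header.
    body = b"<html><head><title>Example</title></head><body>Hello</body></html>"
    record = format_prog_document(
        "http://localhost/example.html", body, int(time.time())
    )
    sys.stdout.buffer.write(record)
```

Because the date travels with each document, swish-e has something to compare
against the file given to -N, which is the piece swishspider does not supply.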

Look at http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/swishe/swish-e/conf/
(example 6).  Also, get a current version of the spider.pl program (from
above).

Another person on this list is currently having problems spidering this way
on FreeBSD -- something in their setup (Perl, LWP, or ??) is causing the
spider to eat up a huge amount of memory, while the same spider and config
run fine on my Linux machine.

Depending on how fast your web server can feed you the docs, -S prog should
be faster than -S http, if that's an issue.



Bill Moseley
mailto:moseley@hank.org
Received on Thu Jun 7 17:17:31 2001