Re: advantages and disadvantages of indexing via the spider

From: Aaron Bazar <aaronb(at)>
Date: Mon Feb 16 2004 - 19:21:03 GMT
I also find the "callback" functionality to be particularly useful in
the script. I use it to specifically ignore certain links on
the remote server and only download what I want. It is really quite

Aaron Bazar
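
For readers unfamiliar with the idea, here is a rough sketch of what a link-filtering callback can look like. It is written in Python purely for illustration and is not the spider script's actual configuration or API; the names should_follow, crawl and SITE, the example.com URLs, and the skip rules are all made up for this example. The crawler asks the callback about every link it finds and only queues the ones the callback approves.

```python
from urllib.parse import urlparse


def should_follow(url: str) -> bool:
    """Hypothetical callback: return True to fetch a link, False to ignore it."""
    path = urlparse(url).path
    if path.startswith("/private/"):
        return False  # skip a whole directory on the remote server
    if path.lower().endswith((".jpg", ".gif", ".zip")):
        return False  # skip file types we do not want to download
    return True


def crawl(start, get_links, callback):
    """Breadth-first walk; get_links(url) returns the links found on that page."""
    seen, queue, fetched = {start}, [start], []
    while queue:
        url = queue.pop(0)
        fetched.append(url)
        for link in get_links(url):
            # The callback decides whether a newly seen link is queued at all.
            if link not in seen and callback(link):
                seen.add(link)
                queue.append(link)
    return fetched


# Stand-in for a remote site, so the sketch runs without network access.
SITE = {
    "http://example.com/": [
        "http://example.com/docs/",
        "http://example.com/private/x.html",
        "http://example.com/logo.gif",
    ],
    "http://example.com/docs/": [
        "http://example.com/",
        "http://example.com/docs/a.html",
    ],
}

pages = crawl("http://example.com/", lambda u: SITE.get(u, []), should_follow)
print(pages)
```

Because the filtering happens before a link is queued, rejected pages are never requested, which keeps both the download volume and the index smaller.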

-----Original Message-----
[]On Behalf Of Greg Fenton
Sent: Monday, February 16, 2004 2:06 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re: advantages and disadvantages of indexing via the

--- Eric Lease Morgan <> wrote:
> What are the advantages and disadvantages of indexing via the
> spider?

Since you are talking about a "remote site", then as you said you
either have to use or some other crawler to get the pages.

Ignoring the features of one crawler over another, the upside of is the lower disk requirements and the guarantee of "fresh"
data.  The downside is, in the event of needing to rebuild the
database, indexing will be slower than indexing a pre-crawled local
disk cache.

We use for our *local* site because we have dynamic content
(e.g. Server Side Includes), so filesystem crawls wouldn't be accurate
or would involve more coding on our part.  Since we have an internal
staging server, we don't impact the production site should we need to
rebuild the database a few times a day.

Hope this helps,

Greg Fenton

Received on Mon Feb 16 11:21:03 2004