RE: Incremental updating

From: <adivey1(at)not-real.cox.net>
Date: Thu Jun 10 2004 - 14:35:44 GMT
Great idea, thanks! Also, thanks Bill. I used the advice from both of you to make it work. I'm using -e (oops) and making an index for each site and then merging them together. Wrote me a nice Perl script to do it :)
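
For anyone finding this in the archive, here is a rough sketch of the per-site-then-merge approach. It is written in Python, not the Perl script mentioned above, and it only builds the command lines without running them. The domain names, the `indexes/` directory, and the one-config-per-site naming (`<domain>.conf`, each assumed to point spider.pl at its site, e.g. via `SwishProgParameters`) are placeholders of mine, so adjust them to your setup. The flags used are `-e` (economy mode), `-S prog` with `-i spider.pl`, `-c`, `-f`, and `-M` for merging.

```python
def batches(domains, size):
    """Split the domain list into chunks, e.g. 5-10 sites per night."""
    return [domains[i:i + size] for i in range(0, len(domains), size)]


def index_commands(domains, work_dir="indexes", swish="swish-e"):
    """Build one spider-indexing command per domain (nothing is executed)."""
    cmds = []
    for domain in domains:
        cmds.append([
            swish,
            "-e",                    # economy mode, to keep RAM use down
            "-S", "prog",            # read documents from an external program
            "-i", "spider.pl",       # the spider that ships with swish-e
            "-c", f"{domain}.conf",  # hypothetical per-site config file
            "-f", f"{work_dir}/{domain}.index",
        ])
    return cmds


def merge_command(index_files, output, swish="swish-e"):
    """Build the merge command: input indexes first, output index last."""
    return [swish, "-M", *index_files, output]


if __name__ == "__main__":
    sites = ["a.example", "b.example", "c.example"]  # placeholder domains
    for night, batch in enumerate(batches(sites, 2), start=1):
        print(f"night {night}:")
        for cmd in index_commands(batch):
            print("  " + " ".join(cmd))
    all_indexes = [f"indexes/{s}.index" for s in sites]
    print(" ".join(merge_command(all_indexes, "indexes/all.index")))
```

Each command list can be handed to `subprocess.run` as is. Keeping one index per site means a failed spider run only costs you that site, and the merge can be redone whenever a batch finishes.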
> 
> From: "Aaron Bazar" <aaronb@spamcop.net>
> Date: 2004/06/08 Tue PM 08:16:05 EDT
> To: Multiple recipients of list <swish-e@sunsite3.berkeley.edu>
> Subject: [SWISH-E] RE: Incremental updating
> 
> I use the merge option quite frequently with the spider option to index many
> domains. It is slower than the file system, but you can launch several
> spiders at once, make several indexes at once, and then merge them. It is
> also nice because all the URLs are correct to different domains without
> having to do a lot of different substitutions in the URLs and paths. Also,
> you can kill the spider when you think you have indexed enough pages of a
> particular site and the swish-e engine will finish up the index. Also, if
> something goes wrong while spidering/indexing, only a small part of your
> entire work is stopped...
> 
> 
> Regards,
> 
> Aaron Bazar
> http://www.cdnsport.ca
> 
> -----Original Message-----
> From: swish-e@sunsite3.berkeley.edu
> [mailto:swish-e@sunsite3.berkeley.edu]On Behalf Of adivey1@cox.net
> Sent: Tuesday, June 08, 2004 2:48 PM
> To: Multiple recipients of list
> Subject: [SWISH-E] Incremental updating
> 
> 
> Is it possible to do some kind of incremental update (via spider or file
> system)? If not, how often would you recommend indexing a large (1,000,000+
> hits a month) site?
> 
> Also, this is a problem for me doing a large number of sites at once. We're
> trying to develop a search engine for our entire program which has around
> 100 domains. I started it last night and when I got in in the morning, it
> had frozen, swish-e was using 630MB of RAM, and the computer was barely
> usable. I'm using a 500MHz P3, so it's not unexpected for something like
> this to happen, but if I could do 5-10 a night, and then incrementally add
> more to the index, it'd be great.
> 
> Thanks!
> 
> 
> 
Received on Thu Jun 10 14:35:46 2004