incremental indexing and spidering

From: Brandon Shalton <brandon(at)>
Date: Tue Oct 10 2006 - 16:55:33 GMT

I am a long-time user of swish, and am greatly encouraged by the incremental
indexing option.

I do a lot of spidering to map how websites link to each other (the database
is over 1B records), and I want to keyword-index the pages as well.

Ideally I would like to do: -c config.file | swish-e -S prog -i 

The idea is not to spider a mirror copy to disk, but to pump the pages
directly into swish with the incremental index, so that I could have 200 of
these command lines running, each indexing to its own .idx file (200 in all).

At the end of the day, I would merge the 200 .idx files into one daily index.
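
The workflow described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested setup: SPIDER_CMD stands in for the spider command, which is not given in the post; the worker-NNN.idx and daily.idx file names are hypothetical; and "-i stdin" is assumed as the input argument for "-S prog". The functions only print the command lines rather than run them, and they do not cover the incremental-index build itself.

```shell
#!/bin/sh
# Sketch only: prints the per-worker pipelines and the end-of-day merge.
# SPIDER_CMD is a placeholder; the post does not name the spider command.
SPIDER_CMD=${SPIDER_CMD:-./my-spider}

# Hypothetical naming scheme: one zero-padded index file per worker.
worker_index() {
    printf 'worker-%03d.idx\n' "$1"
}

# Print the pipeline for worker N: spider output is piped straight into
# swish-e (-S prog, reading from stdin) and written to that worker's index
# via -f, so no mirror copy is written to disk.
worker_command() {
    printf '%s -c config.file | swish-e -S prog -i stdin -f %s\n' \
        "$SPIDER_CMD" "$(worker_index "$1")"
}

# Print the merge command for workers 1..N.
# swish-e -M takes the input indexes first and the output index last.
merge_command() {
    cmd='swish-e -M'
    i=1
    while [ "$i" -le "$1" ]; do
        cmd="$cmd $(worker_index "$i")"
        i=$((i + 1))
    done
    printf '%s daily.idx\n' "$cmd"
}
```

Calling worker_command for each of 1 through 200 and then merge_command 200 would print the full set of command lines, which can be reviewed before anything is actually run.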

I tried a while ago to use the experimental incremental indexing, but I
couldn't get it all to work as described above.

Any pointers would be greatly appreciated.

Received on Tue Oct 10 09:56:38 2006