Re: [swish-e] Regarding scalibilty and multithreading in Swish-e

From: Judith Retief <JudithR(at)>
Date: Tue Feb 19 2008 - 09:56:58 GMT
> How scalable is Swish-e if we crawl millions of pages?
We use swish-e to index local files, not web sites, so I can't offer an opinion on crawling as such. What I can say is that the core technologies of indexing and searching scale pretty well: we have about 2 million content pages indexed, add about 10,000 daily, and searches are fast (sub-second).
We do a little extra work to speed up indexing. During the day, as we receive new files, we index them into a 'daily' index file. A nightly job merges the daily file into the main 'master' index file. Searches are run against all the index files.
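
For anyone wanting to try the same daily/master arrangement, here is a minimal sketch of the three command lines involved, written as Python helpers that only build the argument lists. It assumes the swish-e binary is on PATH and uses the standard -c (config), -f (index files), -M (merge) and -w (query) switches. The helper names and all file names are hypothetical, not taken from our actual setup.

```python
from typing import List

SWISH = "swish-e"  # assumption: the swish-e binary is on PATH


def index_cmd(config: str, index_file: str) -> List[str]:
    """Index the files named in `config` into `index_file` (e.g. the daily index)."""
    return [SWISH, "-c", config, "-f", index_file]


def merge_cmd(inputs: List[str], output: str) -> List[str]:
    """Merge several index files into one; with -M the output file comes last."""
    return [SWISH, "-M", *inputs, output]


def search_cmd(query: str, index_files: List[str]) -> List[str]:
    """Search one or more index files in a single run."""
    return [SWISH, "-w", query, "-f", *index_files]


# Hypothetical file names, for illustration only.
daytime = index_cmd("daily.conf", "daily.index")
nightly = merge_cmd(["master.index", "daily.index"], "master.new.index")
lookup = search_cmd("scalability", ["master.index", "daily.index"])
```

Each list can be handed to subprocess.run(cmd, check=True). In this sketch the nightly merge writes to a new file, so a real job would still need to swap the merged file into place as the master and start a fresh daily index; how that is done is left open here.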

Received on Tue Feb 19 04:57:01 2008