> I use swish-e to create an index of pdf files each
> night that were created that previous day. I store each
> day's index files in a folder on disk.
You might consider doing something like what I do.
I use SWISH-E for a mailing list that has lots of postings. I rebuild
an index of all postings from Sunday morning through the present every
10 minutes or so (by Saturday afternoon that's about 300 documents, maybe
5MB for the week, and it takes less than a minute); I rebuild an index of
everything from January 1 through Saturday night once per week (we're
almost halfway through the year and it took just about 10 minutes this
week ... roughly 30MB, 13k documents); and I rebuild an index of
everything up through the previous December 31 every year. This year it
took most of the day to do 1998-2003 (about 1GB total?), but it's only
once per year so who cares?
That way I only ever have to search three indexes.
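
Here's a rough sketch of how the search side could be wired up. It only
builds the command line; the index file names are made up for
illustration, and it assumes a swish-e build where -f accepts more than
one index file when searching, so check that against your version.

```python
def search_command(query, index_files, binary="swish-e"):
    """Build an argv list that searches several index files in one run.

    The index names passed in are hypothetical; use whatever your
    nightly/weekly/yearly jobs actually write out.
    """
    if not index_files:
        raise ValueError("need at least one index file")
    # -w gives the search words, -f lists the index file(s) to search.
    return [binary, "-w", query, "-f", *index_files]


# Hypothetical file names for the three tiers.
cmd = search_command(
    "invoice",
    ["prior_years.index", "year_to_date.index", "this_week.index"],
)
```

Pass the resulting list to subprocess.run() or similar to actually run
the search.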
Try to think about your problem this way: much of the info doesn't
change, so try not to rebuild the index that it's in very often ...
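
In case it helps, here is a minimal sketch of the three-way split
described above: given a document's date, it says which index the
document belongs in. The function name and the tier labels are just
illustrative, not anything swish-e itself defines, and it assumes the
week starts on Sunday as in my setup.

```python
from datetime import date, timedelta


def tier_for(doc_date: date, today: date) -> str:
    """Return which of the three indexes a document dated doc_date goes in."""
    # Most recent Sunday on or before today (weekday(): Monday=0 ... Sunday=6).
    week_start = today - timedelta(days=(today.weekday() + 1) % 7)
    year_start = date(today.year, 1, 1)

    if doc_date >= week_start:
        return "this_week"      # rebuilt every few minutes, small
    if doc_date >= year_start:
        return "year_to_date"   # rebuilt once per week
    return "prior_years"        # rebuilt once per year, large
```

Only the "this_week" index gets rebuilt often, and it is always the
smallest of the three.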
Received on Tue Jun 22 00:36:18 2004