At 10:27 AM 08/02/01 -0700, Rick McGowan wrote:
>But the indexing operation doesn't seem to ever complete... It takes forever.
>
>I have 20,000 small files of about 1k bytes each; e-mail archives. On an
>800MHz Pentium III with 128MB of main memory (recent Linux OS)... is it
>reasonable for the initial indexing of this dataset to take over 48 hours?
No, that's not reasonable. What exact version are you running? I would
expect that to index in just a few minutes in 2.1-dev. I can index 30,000
/usr/doc (much larger) files in about 15 minutes on my machine with 128M --
and that's swapping.
In general, memory is the real problem with swish. 128M is not that much,
but it should be enough for your situation.
2.1-dev might be good for you, especially if you are indexing mail
archives. You can write a perl script to extract the From:, To:,
Subject:, and Date: headers and feed that data to swish for indexing. Then you can
limit searches to those fields.
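
A rough sketch of that extraction step, written here in Python rather than perl. The function names and the tag names (<mail>, <from>, <to>, <subject>, <date>, <body>) are made up for illustration. How swish is told to read this output and treat those tags as searchable fields is not shown, so check the docs for the version you are running.

```python
from email import message_from_string
from xml.sax.saxutils import escape

# Header fields to pull out of each message (the ones named above).
FIELDS = ("From", "To", "Subject", "Date")


def extract_fields(raw_message):
    """Return a dict mapping each selected header to its value ('' if absent)."""
    msg = message_from_string(raw_message)
    return {name: str(msg.get(name, "")) for name in FIELDS}


def to_tagged_document(raw_message):
    """Wrap the selected headers and the body in simple tags.

    The tag layout is hypothetical; it only illustrates putting each
    field in its own element so an indexer could treat it separately.
    """
    msg = message_from_string(raw_message)
    fields = extract_fields(raw_message)

    # Multipart messages return a list of parts; this sketch skips them.
    payload = msg.get_payload()
    body = payload.strip() if isinstance(payload, str) else ""

    lines = ["<mail>"]
    for name in FIELDS:
        tag = name.lower()
        lines.append("<%s>%s</%s>" % (tag, escape(fields[name]), tag))
    lines.append("<body>%s</body>" % escape(body))
    lines.append("</mail>")
    return "\n".join(lines)
```

You would run something like to_tagged_document() over each file in the archive and hand the result to the indexer, instead of indexing the raw mail files directly.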
For fun, try indexing again with 2.1-dev.
http://sunsite.berekely.edu:4444/swish-daily/
Bill Moseley
mailto:moseley@hank.org
Received on Thu Aug 2 21:51:32 2001