At 10:15 PM 12/05/01 -0800, Benjamin Grosser wrote:
>I'm getting the following message when trying to merge index files:
>
> swish: Ran out of memory (could not allocate enough)!
You may be out of luck.
Maybe Jose will have some comments (or will correct anything I get
wrong).
First, merging isn't just a matter of copying two files together. It
basically goes through the entire indexing process again.
Jose and Bill Meier and others put in a lot of work optimizing the indexing
process. I can now index on my machine in about three minutes what once
took five hours ;) (ok, so a machine swapping to death isn't a good
comparison.)
But the optimizations are really focused on file-by-file indexing. Jose
does a lot of compression after processing each file while indexing. None
of that is done while merging. Basically, I think, merging falls back to
something like swish 1.3 memory requirements.
I can index about 25K files on my machine in about 70MB of memory. If I try
to merge that index with another index containing only one file, I run out
of memory.
You have two options at this time.
Index everything at once, or, if that's not reasonable, keep the indexes
separate and specify multiple index files to swish with the -f switch when
searching.
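A minimal sketch of the second option, in case you drive swish from a script. It only builds and runs a search command that lists several index files after -f. The binary name "swish-e", the index file names, and the helper names are assumptions for illustration, so adjust them to your install.

```python
import subprocess


def build_search_command(query, index_files, binary="swish-e"):
    """Return the argv list for one search across several index files.

    The binary name and the index file names are hypothetical; -w gives
    the search words and -f is followed by every index to search.
    """
    if not index_files:
        raise ValueError("at least one index file is required")
    return [binary, "-w", query, "-f", *index_files]


def search(query, index_files, binary="swish-e"):
    """Run the search and return the raw text output from swish."""
    result = subprocess.run(
        build_search_command(query, index_files, binary),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print the command instead of running it, so this works without swish.
    print(" ".join(build_search_command("memory", ["docs.index", "mail.index"])))
```

This way each index is built on its own, with the normal per-file compression, and nothing has to be merged.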
The disadvantage of using multiple index files, IIRC, is only that sorting
very large result sets is slower and requires more memory. With a single
index, sorting uses the "pre-sorted" tables, which makes it faster.
There may be some way to use better compression during merge, but I think
it will be a while before that gets any attention.
>However, now it seems as if the merge is dying regardless of how much is
>free. It seems to run right up to using about 530MB of physical RAM and
>then quits. I freed up to about 650MB and it still quits after about
>530MB (hard to tell--I'm watching top so am not getting precise figures).
Could you also be running up against a ulimit setting?
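One quick way to check is to print the limits a process actually inherits. This is only a sketch using Python's standard resource module (Unix only); the helper names are mine, and which limit matters (data segment vs. address space) depends on your system.

```python
import resource


def format_limit(value):
    """Render a limit given in bytes as megabytes, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**20:.0f} MB"


def describe_limit(name, rlimit_id):
    """Return one line with the soft and hard values of a resource limit."""
    soft, hard = resource.getrlimit(rlimit_id)
    return f"{name}: soft={format_limit(soft)} hard={format_limit(hard)}"


if __name__ == "__main__":
    # The soft limit is what the process hits first; the hard limit is the
    # ceiling the soft limit can be raised to without extra privileges.
    print(describe_limit("data segment", resource.RLIMIT_DATA))
    print(describe_limit("address space", resource.RLIMIT_AS))
```

If either soft limit comes out near the point where the merge dies, that would point to the limit rather than to free RAM. Running "ulimit -a" in the shell that starts swish shows the same values.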
Try running with multiple indexes using -f and see how that works.
Bill Moseley
mailto:moseley@hank.org
Received on Thu Dec 6 07:36:55 2001