
Re: Shrinking swish-e memory footprint

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Sep 26 2001 - 21:33:05 GMT
At 02:13 PM 09/26/01 -0700, Michael wrote:
>It's not really practical for me to try the dev code until merge is 
>working. On 2.0.5, I stopped a full index at ~97k files when the memory 
>requirements grew beyond 600 megabytes -- about a day and a half of 
>indexing. I was able to index and merge the data incrementally. The 
>last merge involved a 66 meg index and a 1.5 meg index. The memory 
>footprint was just shy of 600 megs. This seems a bit inefficient???

Well, that was my point.  The dev code is much better about memory usage.
There are still limitations, but it's a world of difference from 2.0.5.  I
only have about 25,000 HTML files on my machine to test with (which takes
about 5 minutes and 80M), so I'd be curious to see what happens with larger
file sets.

Not to mention it would be nice to get some testing done on the code before
it's released.  And to get some testing with libxml2, which swish-e can
now use. (The advantage is that it's a push parser for XML and HTML, so it
should use less memory for large files, and it's probably much better than
the buggy html.c parser.)
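
For example, here's roughly what feeding a file to libxml2's push parser
in fixed-size chunks looks like.  This is only a sketch with a dummy SAX
callback that counts elements (a real indexer would collect words there),
not what swish-e itself does, but it shows why memory stays roughly
constant no matter how big the document is:

    /* push-parse a file with libxml2 SAX callbacks; no DOM tree is built */
    #include <stdio.h>
    #include <string.h>
    #include <libxml/parser.h>

    static void start_element(void *ctx, const xmlChar *name,
                              const xmlChar **attrs)
    {
        (void)name; (void)attrs;
        (*(long *)ctx)++;    /* just count elements for this demo */
    }

    int main(int argc, char **argv)
    {
        char buf[4096];
        long elements = 0;
        size_t n;
        FILE *fp;
        xmlSAXHandler sax;
        xmlParserCtxtPtr ctxt;

        if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL)
            return 1;

        memset(&sax, 0, sizeof sax);
        sax.startElement = start_element;

        /* the first chunk also lets libxml2 sniff the encoding */
        n = fread(buf, 1, sizeof buf, fp);
        ctxt = xmlCreatePushParserCtxt(&sax, &elements, buf, (int)n, argv[1]);

        while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
            xmlParseChunk(ctxt, buf, (int)n, 0);
        xmlParseChunk(ctxt, buf, 0, 1);   /* terminate the parse */

        printf("%ld elements\n", elements);
        xmlFreeParserCtxt(ctxt);
        fclose(fp);
        return 0;
    }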

>The merge was done incrementally from 47 index files that are each 
>about 1.5 megs. The merge of the first two created a 50+ meg memory 
>footprint and grew from there to 600+ megs for the final merge. I 
>tried it in one fell swoop, but it did not appear to be doing well 
>after about half a day, so I stopped it and used the incremental 
>approach.

That sounds like a major pain.
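
If you end up scripting that pairwise merge again, a small driver like the
one below might save some typing.  It's only a sketch: the
"swish-e -M old new out" merge invocation and the temp-file naming are my
assumptions, so check them against the 2.0.5 docs before trusting it.

    /* fold many small swish-e indexes into one, two at a time */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        char merged[512], out[512], cmd[2048];
        int i;

        if (argc < 3) {
            fprintf(stderr, "usage: %s index1 index2 [index3 ...]\n", argv[0]);
            return 1;
        }
        snprintf(merged, sizeof merged, "%s", argv[1]);

        for (i = 2; i < argc; i++) {
            snprintf(out, sizeof out, "merged.%03d.tmp", i - 1);
            /* assumed merge syntax -- verify against your swish-e version */
            snprintf(cmd, sizeof cmd, "swish-e -M %s %s %s",
                     merged, argv[i], out);
            printf("running: %s\n", cmd);
            if (system(cmd) != 0) {
                fprintf(stderr, "merge failed on %s\n", argv[i]);
                return 1;
            }
            /* drop the previous intermediate index to save disk space */
            if (strstr(merged, ".tmp") != NULL)
                remove(merged);
            snprintf(merged, sizeof merged, "%s", out);
        }
        printf("final index: %s\n", merged);
        return 0;
    }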



Bill Moseley
mailto:moseley@hank.org
Received on Wed Sep 26 21:33:43 2001