Re: a better way to index?

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Mar 10 2003 - 19:55:41 GMT
On Mon, 10 Mar 2003, Hup Chen wrote:

> > > # $parse $source | $swishe -i stdin -S prog -f $pathOut/$test1 -c swish.conf
> >
> > What was the max memory usage?
> 
>   The external parse program was using 2GB, swish-e took about 750MB RAM.

What is $parse doing to use so much memory?

>   under -e mode, swish-e used only up to 150MB, but the external program
> was still using 2GB.  The index time was about the same, 3:15.
> 
> Elapsed time: 03:15:27 CPU time: 00:-4:-3

Hmm, that CPU time is a bit weird.  Maybe a bug in the display -- or maybe
swish is so optimized that for a lot of records it actually gives CPU time
back to the system! ;)

I once had a program (perhaps in the list archives) that read
/usr/share/dict/words and built random 2000-word documents and fed them to
swish.  The program printed the bytes/second every 1000 documents or so,
and it was clear that without -e indexing slowed down as it went, while
with -e it started out slower but didn't really change much.  So I'd
expect that at some point -e would be faster.
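
For anyone who wants to repeat that kind of test, here is a minimal sketch of such a generator.  It is not the original program, just a reconstruction from the description above.  It assumes the -S prog input format of a Path-Name: and a Content-Length: header, a blank line, and then the document body.  The script name gen_docs.py, the docN.txt paths and the function names are made up for illustration.

```python
import random
import sys
import time

WORDS_FILE = "/usr/share/dict/words"  # word list mentioned in the message
WORDS_PER_DOC = 2000                  # size of each random document
REPORT_EVERY = 1000                   # print throughput every N documents


def build_document(words, count, rng):
    """Return `count` randomly chosen words joined by single spaces."""
    return " ".join(rng.choice(words) for _ in range(count))


def format_record(path, body):
    """Wrap one document in the headers assumed for swish-e -S prog input."""
    data = body.encode("utf-8")
    # Content-Length must be the exact byte length of the body.
    header = "Path-Name: %s\nContent-Length: %d\n\n" % (path, len(data))
    return header.encode("ascii") + data


def main(total_docs):
    with open(WORDS_FILE, encoding="utf-8", errors="replace") as fh:
        words = [line.strip() for line in fh if line.strip()]
    rng = random.Random()
    out = sys.stdout.buffer  # records go to stdout, statistics to stderr
    sent = 0
    start = time.time()
    for n in range(1, total_docs + 1):
        body = build_document(words, WORDS_PER_DOC, rng)
        record = format_record("doc%d.txt" % n, body)
        out.write(record)
        sent += len(record)
        if n % REPORT_EVERY == 0:
            out.flush()
            elapsed = max(time.time() - start, 1e-9)
            sys.stderr.write("%d docs, %.0f bytes/sec\n" % (n, sent / elapsed))
            # Reset the counters so each report covers only the last batch.
            sent = 0
            start = time.time()


# Runs only when a document count is given, e.g. "python gen_docs.py 100000".
if __name__ == "__main__" and len(sys.argv) > 1:
    main(int(sys.argv[1]))
```

Pipe it into swish the same way as the command quoted above, for example "python gen_docs.py 100000 | swish-e -i stdin -S prog -f test.index", once as is and once with -e added.  Writes to the pipe block whenever swish falls behind, so the bytes/second figure on stderr roughly tracks how fast swish is consuming documents.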

Sure, -e will be faster if you run out of memory without it...


-- 
Bill Moseley moseley@hank.org
Received on Mon Mar 10 20:01:09 2003