Re: swish memory/perf

From: Timothy Smith <timasmith(at)not-real.hotmail.com>
Date: Tue Feb 04 2003 - 18:26:35 GMT
I commented out the IgnoreLimit line and the indexing completed
satisfactorily.  Without the stop words the index is around 12.5% of the
size of the documentation (275MB/3GB), but that is fine for my needs.

thanks

Tim

>From: Bill Moseley <moseley@hank.org>
>To: Timothy Smith <timasmith@hotmail.com>
>CC: Multiple recipients of list <swish-e@sunsite.berkeley.edu>
>Subject: Re: [SWISH-E] Re: swish memory/perf
>Date: Tue, 4 Feb 2003 05:59:28 -0800 (PST)
>
>On Tue, 4 Feb 2003, Timothy Smith wrote:
>
> > I did it again with -v3 and it has run all night, currently with a size
> > 428M and has stopped at:
> >
> >
> > Removing very common words...
> >   Getting IgnoreLimit stopwords:                     links
>
>Sounds like you hit on a bug.
>
> > Perhaps I should comment out having stopwords?
>
>At least do not use IgnoreLimit.  If I use stopwords at all, it's
>typically a very short list.
>
>I also see that using IgnoreLimit is much slower:
>
>With IgnoreLimit 20 250:
>
>1347 files indexed.  25832099 total bytes.  2346049 total words.
>Elapsed time: 00:01:11 CPU time: 00:01:11
>
>Without:
>
>1347 files indexed.  25832099 total bytes.  2346049 total words.
>Elapsed time: 00:00:20 CPU time: 00:00:20
>
>
>--
>Bill Moseley moseley@hank.org
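
For anyone wanting to compare the two runs quoted above, here is a minimal sketch that parses the summary lines and works out the slowdown. The helper name parse_run and the regular expressions are assumptions made for illustration and are not part of swish-e; they only match the two-line summary format exactly as it appears in the quoted output.

```python
import re

# Patterns for the two summary lines quoted above (format assumed from that output only).
SUMMARY_RE = re.compile(
    r"(?P<files>\d+) files indexed\.\s+"
    r"(?P<bytes>\d+) total bytes\.\s+"
    r"(?P<words>\d+) total words\."
)
TIME_RE = re.compile(r"Elapsed time: (\d+):(\d+):(\d+)")


def parse_run(text):
    """Hypothetical helper: extract counts and elapsed seconds from one run's summary."""
    summary = SUMMARY_RE.search(text)
    elapsed = TIME_RE.search(text)
    if summary is None or elapsed is None:
        raise ValueError("summary lines not found")
    hours, minutes, seconds = (int(part) for part in elapsed.groups())
    return {
        "files": int(summary["files"]),
        "bytes": int(summary["bytes"]),
        "words": int(summary["words"]),
        "seconds": hours * 3600 + minutes * 60 + seconds,
    }


# The two runs, copied from the quoted message.
with_limit = parse_run(
    "1347 files indexed.  25832099 total bytes.  2346049 total words.\n"
    "Elapsed time: 00:01:11 CPU time: 00:01:11"
)
without_limit = parse_run(
    "1347 files indexed.  25832099 total bytes.  2346049 total words.\n"
    "Elapsed time: 00:00:20 CPU time: 00:00:20"
)

# Same input in both runs, so the ratio of elapsed times is the slowdown.
slowdown = with_limit["seconds"] / without_limit["seconds"]

print(f"with IgnoreLimit:    {with_limit['seconds']} s")
print(f"without IgnoreLimit: {without_limit['seconds']} s")
print(f"slowdown factor:     {slowdown:.2f}x")
```

Both runs index the same 1347 files, so the elapsed times can be compared directly: 71 seconds against 20 seconds, roughly a 3.5x slowdown with IgnoreLimit enabled.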


Received on Tue Feb 4 18:26:55 2003