Re: swish-e on a large scale

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Sep 30 2004 - 18:48:59 GMT
On Thu, Sep 30, 2004 at 11:40:17AM -0700, Peter Karman wrote:
> 
> 
> Aaron Levitt wrote on 09/30/2004 01:17 PM:
> 
> > Last but not least... the results of the indexer's first run:
> > 
> > 475,944 unique words indexed.
> > 5 properties sorted.
> > 637,449 files indexed.  2,932,324,538 total bytes.  231,714,672 total 
> > words.
> > Elapsed time: 47:57:02 CPU time: 04:30:30
> > Indexing done!
> > 
> 
> 
> One thing you might consider is judicious use of StopWords. That will 
> help keep your indexes much smaller, though it won't necessarily speed 
> up the indexing time.

It would almost be better to use a dictionary to limit which words go
into the index.  But that makes it hard to search for specific things.
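
To make the tradeoff concrete, here is a minimal Python sketch.  It is
not Swish-e code: the tokenizer, both word lists, and the function names
are made up for illustration.  It only shows that a stop list drops a few
common words, while a dictionary drops everything it does not know,
including a specific term like an error code.

```python
# Illustrative only: toy tokenizer and word lists, not Swish-e internals.
import re

STOPWORDS = {"the", "a", "of", "to", "in", "is"}              # hypothetical stop list
DICTIONARY = {"index", "search", "file", "error", "server"}   # hypothetical allow list

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def index_with_stopwords(text):
    """Keep every word except those on the stop list."""
    return {w for w in tokenize(text) if w not in STOPWORDS}

def index_with_dictionary(text):
    """Keep only words that appear in the dictionary."""
    return {w for w in tokenize(text) if w in DICTIONARY}

doc = "The server returned error E4021 in the index file"
print(sorted(index_with_stopwords(doc)))   # still contains 'e4021'
print(sorted(index_with_dictionary(doc)))  # 'e4021' and 'returned' are gone
```

With the dictionary approach a search for "E4021" can never match,
because the word never made it into the index.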

-- 
Bill Moseley
moseley@hank.org

Received on Thu Sep 30 11:49:07 2004