Re: new function

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Jul 05 2005 - 18:05:34 GMT
On Tue, Jul 05, 2005 at 12:47:04AM -0700, ??????????? ?????? wrote:
> I want to propose adding a new parameter to the config file:
> min_words_in_file 1

Swish-e doesn't really know how many words are in a file until after
they have been indexed.  So each document would either need to be
parsed twice, or indexing redesigned to parse and store words before
indexing, or to have a way to "un-index" all the words.

There's actually code to do the latter -- it's used to reject a
document based on its title.
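
To illustrate the "index first, then un-index" idea, here is a minimal
sketch in Python. It is a toy inverted index, not Swish-e's actual code
or data structures; the names ToyIndex, min_words, and _unindex are
hypothetical and exist only for this example.

```python
import re
from collections import defaultdict


class ToyIndex:
    """Toy inverted index; NOT Swish-e's real data structures."""

    def __init__(self, min_words=1):
        self.min_words = min_words          # hypothetical threshold
        self.postings = defaultdict(set)    # word -> set of document ids

    def add_document(self, doc_id, text):
        """Index a document, then roll it back if it has too few words."""
        words = re.findall(r"\w+", text.lower())
        for word in words:
            self.postings[word].add(doc_id)
        # The word count is only known here, after the words are indexed.
        if len(words) < self.min_words:
            self._unindex(doc_id, words)
            return False
        return True

    def _unindex(self, doc_id, words):
        """Remove every posting that was just added for this document."""
        for word in set(words):
            self.postings[word].discard(doc_id)
            if not self.postings[word]:
                del self.postings[word]


index = ToyIndex(min_words=3)
kept = index.add_document("a.txt", "swish-e indexes many words")
rejected = index.add_document("b.txt", "tiny")
```

The point of the sketch is the ordering: the count check can only run
after the postings exist, so rejecting a document means undoing work
that was already done.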

Is file size not a good enough indication of "too small"?
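
If file size is good enough, the check can happen before the indexer
ever sees a document. A rough sketch, assuming the files live on the
local filesystem; files_big_enough and min_bytes are made-up names for
illustration, not Swish-e settings:

```python
import os


def files_big_enough(root, min_bytes):
    """Yield paths under root whose size is at least min_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) >= min_bytes:
                yield path
```

How the resulting list is handed to the indexer is left open here. The
size threshold is only a proxy for word count: a small file of plain
text can hold more words than a larger file that is mostly markup.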

-- 
Bill Moseley
moseley@hank.org

Received on Tue Jul 5 11:05:34 2005