On Tue, Jul 05, 2005 at 12:47:04AM -0700, ??????????? ?????? wrote:
> I want to propose adding a new parameter to the config file:
> min_words_in_file 1
Swish-e doesn't really know how many words are in a file until after
they have been indexed. So each document would either need to be
parsed twice, or indexing redesigned to parse and store words before
indexing, or to have a way to "un-index" all the words.
There's actually code to do the latter -- it's used to reject a
document based on its title.
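
To illustrate the "parse and store words before indexing" option, here
is a toy sketch in Python. It is not Swish-e code: MIN_WORDS, tokenize
and index_documents are made-up names, and the inverted index is just a
dict. The point is only that the word count is known after parsing, so
words must be buffered before they are committed.

```python
import re
from collections import defaultdict

MIN_WORDS = 3  # hypothetical threshold, playing the role of min_words_in_file


def tokenize(text):
    """Split text into lowercase words (a stand-in for a real parser)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_documents(docs, min_words=MIN_WORDS):
    """Build a toy inverted index, skipping documents with too few words.

    Each document's words are buffered first and are committed to the
    index only once the word count is known.
    """
    index = defaultdict(set)
    skipped = []
    for name, text in docs.items():
        words = tokenize(text)        # parse and buffer the words
        if len(words) < min_words:    # count is known only after parsing
            skipped.append(name)
            continue
        for word in words:            # commit the buffered words
            index[word].add(name)
    return index, skipped
```

The cost is holding one document's words in memory before anything is
written to the index.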
Is file size not a good enough indication of "too small"?
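
A size check like that can happen before the indexer ever sees the
file. A minimal sketch in Python, assuming the list of paths is
produced by some external script; MIN_BYTES and files_large_enough are
hypothetical names, not Swish-e settings:

```python
import os

MIN_BYTES = 64  # hypothetical cutoff for "too small"


def files_large_enough(paths, min_bytes=MIN_BYTES):
    """Yield only the paths whose size on disk is at least min_bytes."""
    for path in paths:
        if os.path.getsize(path) >= min_bytes:
            yield path
```

Byte size is only a proxy for word count: a file that is mostly markup
can pass the size check and still contain very few words.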
--
Bill Moseley
moseley@hank.org
Received on Tue Jul 5 11:05:34 2005