Re: some 2.4.3 -> 2.4.4 weirdness

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Nov 08 2006 - 20:25:07 GMT
On Wed, Nov 08, 2006 at 02:03:56PM -0500, brad miele wrote:
> ok, so just to make sure i do this correctly, i will build two indexes 
> with 2.4.4, the smaller test, and the full.
> 
> then, i will do -T INDEX_WORDS > a file for each and look at how the 
> phrase (Corey Rich) is represented?

Not exactly.  While indexing I'd use -T indexed_words -- that would
show which words are being placed into the index as you are
indexing.  Then you would be able to say, yes, those words are being
parsed and added to the index.

Then after indexing use -T index_words to make sure they actually got
into the index and are indexed under the correct metanames, etc.

At that point if searching doesn't work then we know it's a problem
with how the search code is accessing the index to find the words in
question.
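
A rough sketch of those two checks, in case it helps.  The config and
index file names are made up, the run() wrapper is only there so the
script prints the commands on a machine without swish-e, and I'm
assuming the -T output lands on stdout so it can be redirected:

```shell
#!/bin/sh
# Hypothetical file names; substitute your own config and index paths.
CONF=${CONF:-swish.conf}
INDEX=${INDEX:-index.swish-e}

# Run the command if swish-e is installed, otherwise just print it.
run() {
    if command -v swish-e >/dev/null 2>&1; then
        "$@"
    else
        printf 'would run:'
        printf ' %s' "$@"
        printf '\n'
    fi
}

# 1. While indexing: show the words as they are placed into the index.
run swish-e -c "$CONF" -f "$INDEX" -T indexed_words > indexed_words.txt

# 2. After indexing: dump the words that actually got into the index.
run swish-e -f "$INDEX" -T index_words > index_words.txt

# 3. Look for one word of the phrase in both dumps.  With stemming on,
#    the stored form may differ from what you typed.
grep -i 'rich' indexed_words.txt index_words.txt || true
```

If a word shows up in the first dump but not the second, the problem
is on the indexing side rather than in the search code.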


> also, something in the back of my head is screaming "stemming" at me, but 
> i am not sure why... maybe just a migraine.

Possible.  -T indexed_words and -T index_words would show you the
stemmed words (-T parsed_words, iirc, would show you the pre-stemmed
words).

Then when searching use -H9 to show you what swish is searching for --
that is, to confirm that it's searching using the stemmed words.
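
Something along these lines, again with a made-up index name.  The
inner double quotes ask for a phrase search:

```shell
#!/bin/sh
# Hypothetical index path; use the index you built above.
INDEX=${INDEX:-index.swish-e}

# Inner double quotes make this a phrase search.
QUERY='"corey rich"'

if command -v swish-e >/dev/null 2>&1; then
    # -H9 asks for the most verbose headers, which should include the
    # words swish actually searched for after stemming.
    swish-e -f "$INDEX" -w "$QUERY" -H9
else
    echo "swish-e not found; would run: swish-e -f $INDEX -w '$QUERY' -H9" >&2
fi
```

Compare the words reported in the headers against the -T index_words
dump; if they don't match, that points at stemming.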

You have more experience with your indexes than I do, but my general
feeling is stemming and stop words are not always such a great thing.

-- 
Bill Moseley
moseley@hank.org

Received on Wed Nov 8 12:25:12 2006