On Tue, Oct 04, 2005 at 01:11:16PM -0700, Antonio Barrera wrote:
> I am using Stemming_en. A search for "Environmental" includes 6.xml in
> the results but not 398.xml. A search for "environment" returns 398.xml
> but not 6.xml. In the live version, "Environment" returns 22 hits and
> "Environmental" returns 30. Shouldn't stemming give the same number of hits?
That depends on the words and on how the stemmer treats them. A quick test:
moseley@bumby:~$ cat c
FuzzyIndexingMode stemming_en
moseley@bumby:~$ cat words
Environment
Environmental
moseley@bumby:~$ swish-e -T indexed_words -c c -i words -v0
Adding:[1:swishdefault(1)] 'environ' Pos:5 Stuct:0x9 ( BODY FILE )
Adding:[1:swishdefault(1)] 'environment' Pos:6 Stuct:0x9 ( BODY FILE )
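So the stemmer considers those to be different words: "Environment" is
indexed as 'environ' and "Environmental" as 'environment', so a search for
one does not match documents that contain only the other.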
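
To see why the two words can end up with different stems, here is a rough Python sketch of a single step (step 4) of Porter's stemming algorithm. It assumes stemming_en behaves like the Porter stemmer for these two words; it is not swish-e's code, and all other steps of the algorithm are left out. The names step4, measure and is_consonant are made up for this illustration.

```python
# Simplified sketch of step 4 of Porter's stemming algorithm.
# Assumption: stemming_en acts like Porter's stemmer for these words.
# This is NOT swish-e's implementation; steps 1-3 and 5 are omitted.

STEP4_SUFFIXES = ["ement", "ance", "ence", "able", "ible", "ment",
                  "ant", "ent", "ion", "ism", "ate", "iti", "ous",
                  "ive", "ize", "al", "er", "ic", "ou"]


def is_consonant(word, i):
    """True if word[i] is a consonant in Porter's sense."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        # 'y' is a vowel only when it follows a consonant
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Count vowel-to-consonant transitions (Porter's 'm')."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


def step4(word):
    """Strip at most one suffix, and only if the remaining stem has m > 1."""
    word = word.lower()
    # Try longer suffixes first; only the longest match is considered
    for suffix in sorted(STEP4_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix):
            stem = word[:-len(suffix)]
            # 'ion' is removed only after 's' or 't'
            if suffix == "ion" and not stem.endswith(("s", "t")):
                return word
            return stem if measure(stem) > 1 else word
    return word


for w in ("Environment", "Environmental"):
    print(w, "->", step4(w))
```

The point is that this step removes only one suffix per word: "ment" comes off "environment", but only "al" comes off "environmental", so the two never meet at a common stem.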
--
Bill Moseley
moseley@hank.org
Received on Tue Oct 4 13:40:24 2005