Re: fix for my stemmer_en2 issue

From: brad miele <bmiele(at)not-real.ipnstock.com>
Date: Fri Nov 10 2006 - 21:22:56 GMT
yes, it seems that the volume of files/words was a factor, since it 
didn't/doesn't crop up with smaller sets.

this test was on the full set, so i am sort of baffled as to why that change 
would make the difference.

i guess i should keep looking for a more real solution. the stemmer_en1 
doesn't seem to do as good a job (at least according to our 
salespeople), and we can't seem to make the jump to 2.4.4 with en2.

thanks again for your time Bill,

Brad
---------------------
Brad Miele
VP Technology
IPNStock.com
866 476 7862 x902
bmiele@ipnstock.com

On Fri, 10 Nov 2006, Bill Moseley wrote:

> On Fri, Nov 10, 2006 at 12:56:46PM -0800, brad miele wrote:
>> hi,
>>
>> I was messing around trying to figure out what was making the Stemmer_EN2
>> stuff not work in 2.4.4. i noticed that stemmer.c in 2.4.4 was different
>> from the one in 2.4.3. notably, in the section starting at line 117, under
>> "static FUZZY_OPTS fuzzy_opts[] = {" i find that when i remove the two
>> references at the top to:
>>
>>      { FUZZY_STEMMING_EN2, "Stemming_en", Stem_snowball,
>>        porter_create_env, porter_close_env, porter_stem },
>>      { FUZZY_STEMMING_EN2, "Stem",        Stem_snowball,
>>        porter_create_env, porter_close_env, porter_stem },
>
> That's just a mapping table -- it maps the config names ("None",
> "Stemming_en", etc.) to the code for that stemmer.
>
> The difference between 2.4.3 and 2.4.4 is that we removed the old
> Porter stemmer so Stem and Stemming_en were changed to use the new
> snowball stemmer code instead of the old Porter code.
>
> But, your problem was due to the number of files/words indexed, right?
>
> -- 
> Bill Moseley
> moseley@hank.org
Received on Fri Nov 10 13:22:57 2006
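
For readers following the thread, here is a rough sketch of the kind of mapping table Bill describes: a static array that ties each config name to the code that handles it, plus a lookup by name. This is not the swish-e source. The type names, the mode constants, the stub functions, and find_stemmer() are all hypothetical, and exact case-sensitive matching is an assumption made to keep the example short. Only the config names "None", "Stemming_en" and "Stem" come from the messages above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration only; not the real swish-e definitions. */
typedef enum { MODE_NONE, MODE_SNOWBALL_EN } stem_mode;

typedef const char *(*stem_fn)(const char *word);

typedef struct {
    stem_mode   mode;        /* internal id for the stemmer */
    const char *config_name; /* name accepted in the config file */
    stem_fn     stem;        /* code that handles this stemmer */
} stem_entry;

/* Stub implementations: both return the word unchanged. */
static const char *stem_none(const char *word) { return word; }
static const char *stem_snowball_stub(const char *word) { return word; }

/* Two config names can point at the same implementation. Removing a row
 * only makes that name unknown; it does not change the stemmer code. */
static const stem_entry stem_table[] = {
    { MODE_NONE,        "None",        stem_none },
    { MODE_SNOWBALL_EN, "Stemming_en", stem_snowball_stub },
    { MODE_SNOWBALL_EN, "Stem",        stem_snowball_stub },
};

/* Return the table row for a config name, or NULL if the name is unknown.
 * Assumption: names are compared exactly, including case. */
const stem_entry *find_stemmer(const char *name)
{
    size_t count = sizeof stem_table / sizeof stem_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(stem_table[i].config_name, name) == 0)
            return &stem_table[i];
    }
    return NULL;
}
```

In this sketch, find_stemmer("Stem") and find_stemmer("Stemming_en") return different rows that share one function pointer, which is the sense in which the table is just a mapping. Deleting rows would change which names resolve, not how the words are stemmed.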