yes, it seems that the volume of files/words was a factor, since it
didn't/doesn't crop up with smaller sets.
this test was on the full set, so i am sort of baffled as to why that change
would make a difference.
i guess i should keep looking for a more real solution. the stemmer_en1
doesn't seem to do as good a job (at least according to our
salespeople), and we can't seem to make the jump to 2.4.4 with en2.
thanks again for your time, Bill,
Brad
---------------------
Brad Miele
VP Technology
IPNStock.com
866 476 7862 x902
bmiele@ipnstock.com
On Fri, 10 Nov 2006, Bill Moseley wrote:
> On Fri, Nov 10, 2006 at 12:56:46PM -0800, brad miele wrote:
>> hi,
>>
>> I was messing around trying to figure out what was making the Stemmer_EN2
>> stuff not work in 2.4.4. i noticed that stemmer.c in 2.4.4 was different
>> from the one in 2.4.3. notably, in the section starting at line 117, under
>> "static FUZZY_OPTS fuzzy_opts[] = {" i find that when i remove the two
>> references at the top to:
>>
>> { FUZZY_STEMMING_EN2, "Stemming_en", Stem_snowball,
>> porter_create_env, porter_close_env, porter_stem },
>> { FUZZY_STEMMING_EN2, "Stem", Stem_snowball,
>> porter_create_env, porter_close_env, porter_stem },
>
> That's just a mapping table -- it maps the config names ("None",
> "Stemming_en", etc.) to the code for that stemmer.
>
> The difference between 2.4.3 and 2.4.4 is that we removed the old
> Porter stemmer so Stem and Stemming_en were changed to use the new
> snowball stemmer code instead of the old Porter code.
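To make the "mapping table" idea concrete, here is a minimal sketch of a name-to-handler table in C. It is not the swish-e source: stem_fn, stem_entry, stem_table, lookup_stemmer, no_stem and snowball_en_stub are all hypothetical names, and the stub stemmer simply returns the word unchanged. It only illustrates how two config names can point at the same code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the swish-e types or functions. */
typedef const char *(*stem_fn)(const char *word);

/* "None": leave the word as it is. */
static const char *no_stem(const char *word) { return word; }

/* Placeholder for a real stemmer: returns the word unchanged. */
static const char *snowball_en_stub(const char *word) { return word; }

typedef struct {
    const char *config_name;  /* name as written in the config file */
    stem_fn     stem;         /* code that handles that name */
} stem_entry;

/* Two config names may share one implementation, the way Stem and
 * Stemming_en are described as both using the snowball code. */
static const stem_entry stem_table[] = {
    { "None",        no_stem },
    { "Stemming_en", snowball_en_stub },
    { "Stem",        snowball_en_stub },
};

/* Return the handler for a config name, or NULL if the name is unknown. */
static stem_fn lookup_stemmer(const char *config_name)
{
    size_t i;
    for (i = 0; i < sizeof stem_table / sizeof stem_table[0]; i++) {
        if (strcmp(stem_table[i].config_name, config_name) == 0)
            return stem_table[i].stem;
    }
    return NULL;
}
```

In a table like this, deleting an entry only means that its config name no longer resolves to a handler; the handler code itself is untouched.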
>
> But, your problem was due to the number of files/words indexed, right?
>
> --
> Bill Moseley
> moseley@hank.org
Received on Fri Nov 10 13:22:57 2006