multi language and stemming take 2

From: <bmiele(at)not-real.auroraquanta.com>
Date: Thu Mar 27 2003 - 19:58:59 GMT
Hi,

Well, I managed to procrastinate on this project for 4 months!

I am getting ready to begin again and thought I would check to see if
anyone has any new info on the multi-language front. I have included my
previous mail for all who don't wait on my every missive :)

Brad
------------------------------------------------------------
 Brad Miele
 Chief Technology Officer
 Aurora & Quanta Productions
 bmiele@auroraquanta.com
 (207)828-8787 x110

"I got a postcard from a strange cloud
and at the bottom she signed:
'all of us are aliens somewhere'" --Geggy Tah

---------- Forwarded message ----------
Date: Fri, 25 Oct 2002 07:52:54 -0400 (EDT)
From: Brad Miele <bmiele@auroraquanta.com>
To: swish-e@sunsite.berkeley.edu
Subject: multi language and stemming

Hi,

I have been using swish-e with great success for some time now. Currently
we are using the prog method and XML to index our database of 80,000 image
records, and the indexing and searching are fast and consistent.

We are about to begin our first Spanish-language site, with German to
follow, and I am wondering if anyone has experience with using alternate
stemming options.

I have found Snowball (http://snowball.tartarus.org/), and my current plan
is to take my incoming index data and pre-stem it in the prog portion of
the indexing, then put it in a stem_metaname field for indexing.
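
A rough sketch of that plan, in Python rather than whatever the real prog
script is written in. Everything here is an assumption for illustration:
toy_stem is a hypothetical stand-in that only strips two plural endings (it
is NOT Snowball), the <doc> and <caption> element names are made up, and the
Path-Name / Content-Length headers are the prog-method headers as I recall
them from the swish-e documentation, so check them against your version.

```python
import re
from xml.sax.saxutils import escape


def toy_stem(word):
    """Hypothetical placeholder stemmer; a real Snowball stemmer goes here."""
    for suffix in ("es", "s"):
        # Strip the suffix only if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_text(text, stem=toy_stem):
    """Lowercase the text, split it into words, and stem each word."""
    words = re.findall(r"\w+", text.lower())
    return " ".join(stem(w) for w in words)


def prog_record(path, caption, stem=toy_stem):
    """Build one record (headers, blank line, XML body) for the prog method.

    The original caption is kept as-is; the pre-stemmed copy goes into a
    separate stem_metaname element so it can be indexed as its own metaname.
    """
    body = (
        "<doc>"
        f"<caption>{escape(caption)}</caption>"
        f"<stem_metaname>{escape(stem_text(caption, stem))}</stem_metaname>"
        "</doc>\n"
    ).encode("utf-8")
    header = (
        f"Path-Name: {path}\n"
        # Content-Length is the size of the body in bytes, not characters.
        f"Content-Length: {len(body)}\n"
        "\n"
    ).encode("utf-8")
    return header + body


if __name__ == "__main__":
    import sys

    sys.stdout.buffer.write(prog_record("img/1.xml", "Casas y flores"))
```

One consequence of this design: search terms would have to be run through
the same stemmer before being matched against stem_metaname, otherwise an
unstemmed query word would not match the stemmed index entries.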

My question is really whether this is the best way to go about it. Has
anyone come across or built multi-language stemmers that can replace the
existing swish stemmer? Any experiential information would be appreciated.

Brad
------------------------------------------------------------
 Brad Miele
 Chief Technology Officer
 Aurora & Quanta Productions
 bmiele@auroraquanta.com
 (207)828-8787 x110

FreeBSD -- because rebooting is for adding new hardware!
Received on Thu Mar 27 20:02:47 2003