RE: word fragment stemming

From: McKenzie, Chuck <Chuck.McKenzie(at)not-real.pleasantco.com>
Date: Thu Mar 20 2003 - 18:32:55 GMT
Yes, this works perfectly.

All this time I thought swish didn't have wildcard matching, and it turned
out that it was just my paranoid CGI filtering out all nonalphanumeric input
characters, so the '*' couldn't get through.
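
For anyone who hits the same problem, here is a minimal sketch of an input filter that still strips unexpected characters but lets '*' through. It is written in Python for illustration only; the function name and the exact whitelist are assumptions, not the CGI script discussed in this thread.

```python
import re

# Hypothetical whitelist: letters, digits, whitespace and the '*' wildcard.
# Every other character is replaced by a space.
_DISALLOWED = re.compile(r"[^A-Za-z0-9\s*]")

def sanitize_query(raw):
    """Strip characters outside the whitelist and collapse runs of whitespace."""
    cleaned = _DISALLOWED.sub(" ", raw)
    return " ".join(cleaned.split())
```

With this filter a query such as "hosp* app*" reaches the search engine unchanged, while punctuation like ';' or '-' is still dropped.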

D'oh.

Wow I feel dumb now.  Thanks ;)

-----Original Message-----
From: john cooper [mailto:john@wrenhill.com]
Sent: Thursday, March 20, 2003 12:19 PM
To: Chuck.McKenzie@Pleasantco.com
Cc: Multiple recipients of list
Subject: Re: [SWISH-E] word fragment stemming


Hi,

Could you not just alter each search word to end in a 'wildcard' symbol
before searching, so that "hosp app" would actually search for
"hosp* app*"?

If they do use full words it will still work.

Phrases and reserved words would need handling, but a little Perl should
do it!
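
A rough sketch of that rewrite, in Python rather than Perl and for illustration only. The function name and the reserved-word list (and, or, not) are assumptions; anything that is not a plain alphanumeric word (quoted phrases, words that already end in '*', tokens with parentheses or '=') is passed through untouched.

```python
import re

# Assumed reserved words; adjust to match the operators your setup accepts.
RESERVED = {"and", "or", "not"}

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')

def add_wildcards(query):
    """Append '*' to each plain search word, leaving everything else as typed."""
    out = []
    for tok in TOKEN.findall(query):
        if tok.isalnum() and tok.lower() not in RESERVED:
            out.append(tok + "*")
        else:
            out.append(tok)
    return " ".join(out)
```

So "hosp app" becomes "hosp* app*", while a query like hosp and "gift shop" keeps its operator and phrase as they are.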

Cheers

John


McKenzie, Chuck wrote:

>Is there an existing way to have swish-e match a search for the beginning
>of a word to the full word?  As an example, matching "hosp" to "hospital"?
>I haven't found a way to do it with existing Stemming or FuzzyMatching
>options.
>
>Would the best way to do this be to use a custom indexing program that for
>each word, added an index entry for substr($word,0,3) through
>substr($word,0,length($word))?
>
>In context, I'm looking to replace our existing Netscape Enterprise Server
>search engine with apache/swish-e, but many people have been trained to use
>the existing search engine with only word fragments, and I need to avoid
>having to retrain them.
>
>Thanks,
>Chuck McKenzie
>
>
>
>*********************************************************************
>Due to deletion of content types excluded from this list by policy,
>this multipart message was reduced to a single part, and from there
>to a plain text message.
>*********************************************************************
>
>  
>
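
For reference, the prefix-indexing idea from the original question (one index entry per leading fragment, from the first three characters up to the whole word) could be sketched as below. This is Python for illustration only; the function name and the min_len parameter are assumptions, and it says nothing about how such entries would actually be fed to an indexer.

```python
def prefixes(word, min_len=3):
    """Return every leading fragment of word, from min_len characters up to the full word."""
    if len(word) <= min_len:
        return [word]
    return [word[:n] for n in range(min_len, len(word) + 1)]
```

For "hospital" this yields six entries, from "hos" through "hospital", so the index grows with word length. The wildcard approach above avoids that cost.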



Received on Thu Mar 20 18:36:41 2003