Re: Proximity Searching, Stemming

From: Peter Karman <karman(at)not-real.cray.com>
Date: Fri Jul 09 2004 - 15:56:25 GMT
Word positions are stored for use with phrase searching. Proximity
matching (i.e., matching words whose positions are within range X of
one another) is supported only as far as *finding* the docs, since that
is a simple AND search. Adjusting the rank of a doc according to how
close the words are is not yet supported. We've talked about that
feature for future ranking improvements.
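
To illustrate the idea only (this is not swish-e code, and every name in it is hypothetical), here is a minimal Python sketch of a proximity filter layered on top of an AND search. It assumes a naive whitespace tokenizer and that a word's position is simply its index in the token list; a real index would also involve stemming, stopwords and so on.

```python
from itertools import product


def word_positions(text):
    """Map each lowercased word to the list of positions where it occurs."""
    positions = {}
    for index, word in enumerate(text.lower().split()):
        positions.setdefault(word, []).append(index)
    return positions


def within_proximity(positions_a, positions_b, max_distance):
    """True if some occurrence of A is within max_distance words of some B."""
    return any(
        abs(a - b) <= max_distance
        for a, b in product(positions_a, positions_b)
    )


def proximity_filter(docs, word_a, word_b, max_distance):
    """Return names of docs where both words occur close to each other.

    docs is a dict of {name: text}. The membership test is the plain AND
    search; the distance check is the extra proximity step.
    """
    hits = []
    for name, text in docs.items():
        positions = word_positions(text)
        if word_a in positions and word_b in positions:  # AND search
            if within_proximity(positions[word_a], positions[word_b],
                                max_distance):
                hits.append(name)
    return hits
```

The same position lists could in principle feed a rank adjustment (closer pairs score higher), which is the part described above as not yet supported.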

to see word positions, check out the swish-e -T index_words_full option.



Tac wrote on 7/9/04 10:47 AM:

> First, I gotta say how amazed I am with the speed.  This is at least 10x
> faster than our current search technology.
>  
> However, our current search engine does a few more things, so I figured I'd
> ask about a couple of issues and see if swish-e might already support them
> or maybe in Version 3.
>  
> Does swish-e support proximity searching, so that you can find words when
> they're within a few words of each other?  e.g.  "smoking ban" w/5 airport
> would find "airport smoking ban" and "smoking ban in airports".  If so, that
> would mean that the word offsets were somehow stored, so the next question
> would be: "could we get those word offsets?"  I realize that stemming
> happens at indexing, not searching, time, so when a document comes back, we
> really don't know what word(s) matched.  This makes highlighting difficult.
> My idea is that if we had access to the word offsets, we'd know which words
> were matched.
>  
> Does this make sense?
>  
> Tac

-- 
Peter Karman - Software Publications Engineer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Fri Jul 9 08:56:35 2004