Proximity Searching, Stemming

From: Tac <tac(at)>
Date: Fri Jul 09 2004 - 15:49:07 GMT
First, I gotta say how amazed I am with the speed.  This is at least 10x
faster than our current search technology.
However, our current search engine does a few more things, so I figured I'd
ask about a couple of issues and see whether swish-e already supports them,
or might in Version 3.
Does swish-e support proximity searching, so that you can find words when
they're within a few words of each other?  e.g.  "smoking ban" w/5 airport
would find "airport smoking ban" and "smoking ban in airports".  If so, that
would mean that the word offsets were somehow stored, so the next question
would be: "could we get those word offsets?"  I realize that stemming
happens at indexing, not searching, time, so when a document comes back, we
really don't know what word(s) matched.  This makes highlighting difficult.
My idea is that if we had access to the word offsets, we'd know which words
were matched.
Does this make sense?
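
To make the question concrete, here is a rough Python sketch of the idea.
It is not swish-e code and uses no swish-e API: the function names
(toy_stem, build_positions, within, highlight) are made up for illustration,
and the toy stemmer is only a stand-in for whatever stemmer the index really
uses.  It assumes the index keeps, for every stem, the word offsets where it
occurs, and that "w/5" means at most 5 word steps between the term and the
nearest end of the phrase.

```python
import re
from collections import defaultdict


def toy_stem(word):
    """Crude stand-in for a real stemmer: lowercase, strip one trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def build_positions(text):
    """Return the word list and a map from each stem to its word offsets."""
    words = re.findall(r"\w+", text)
    positions = defaultdict(list)
    for offset, word in enumerate(words):
        positions[toy_stem(word)].append(offset)
    return words, positions


def phrase_starts(positions, phrase):
    """Offsets where the stemmed phrase words occur consecutively."""
    stems = [toy_stem(w) for w in phrase.split()]
    return [
        p
        for p in positions.get(stems[0], [])
        if all(p + i in positions.get(s, []) for i, s in enumerate(stems))
    ]


def within(positions, phrase, term, distance):
    """Offsets of the words that satisfy: "phrase" w/distance term."""
    n = len(phrase.split())
    hits = set()
    for start in phrase_starts(positions, phrase):
        end = start + n - 1
        for t in positions.get(toy_stem(term), []):
            # Word steps from the term to the nearest end of the phrase.
            gap = start - t if t < start else t - end
            if 0 < gap <= distance:
                hits.update(range(start, end + 1))
                hits.add(t)
    return sorted(hits)


def highlight(words, hits):
    """Wrap the original (unstemmed) words at the matched offsets in brackets."""
    return " ".join(f"[{w}]" if i in hits else w for i, w in enumerate(words))


words, positions = build_positions("smoking ban in airports")
hits = within(positions, "smoking ban", "airport", 5)
print(highlight(words, hits))
```

The point is that the offsets returned by the match refer back to the
original words, so "airports" gets highlighted even though only the stem
"airport" was indexed and searched.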

