At 01:17 AM 08/08/02 -0700, Bernhard Weisshuhn wrote:
>> But you can probably create this a number of ways. One easy way is to just
>> create a long string of all your part numbers, with some type of delimiter
>> between them. Then just do a substring search and then extract out the
>> full part number. You could even apply regular expressions in this way to
>> do rather complex "searches". My guess is it would be reasonably fast,
>> too.
>
>That's how I do it now. I have a preprocessor in Perl that creates
>monstrous ORed queries for this. But it's butt ugly and I always feel
>like I'm abusing our wonderful little swish-e...
Oh, so you might be combining your query for partial part numbers with some
other query? That does make it a bit more trouble.
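For reference, the delimited-string idea quoted at the top might look roughly like this in Python. It is only a sketch: the newline delimiter, the function names, and the sample part numbers are assumptions made for illustration, not anything swish-e provides.

```python
import re

DELIM = "\n"  # assumed delimiter; it must never occur inside a part number


def build_haystack(part_numbers):
    # Delimiters at both ends so every entry is bounded on both sides.
    return DELIM + DELIM.join(part_numbers) + DELIM


def find_parts(haystack, fragment):
    """Return every full part number that contains `fragment`."""
    if not fragment or DELIM in fragment:
        return []
    hits = []
    pos = haystack.find(fragment)
    while pos != -1:
        start = haystack.rfind(DELIM, 0, pos) + 1  # just past the previous delimiter
        end = haystack.find(DELIM, pos)            # the delimiter closing this entry
        hits.append(haystack[start:end])
        pos = haystack.find(fragment, end)         # continue with the next entry
    return hits


def find_parts_regex(haystack, pattern):
    """Same idea with a regular expression; relies on DELIM being a newline."""
    rx = re.compile(r"^.*(?:%s).*$" % pattern, re.MULTILINE)
    return [m.group(0) for m in rx.finditer(haystack)]
```

With `h = build_haystack(["AB-100", "AB-200", "XY-100"])`, `find_parts(h, "100")` picks out the two entries containing "100", and `find_parts_regex(h, r"AB-\d00")` picks out the two "AB-" entries.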
>In German - and I guess in many other languages too - you have a lot
>of compound words (is that the right term?), something like
>"Stackenblokkenquiz", and people want to find that when only looking for
>'quiz'.
Ok, that makes sense.
>Wouldn't it be possible to "just" (heh!) index reversed words and then
>also reverse the left-masked words in the query? Something like this,
>you get the idea....
Sure, that's a good idea. I thought the advantage of your current way is
you can find substrings -- e.g. wildcard at both ends (or use regular
expressions).
Might be nice to have an option to write an extra wildcard index with the
words reversed, with perhaps a list of metanames that are indexed this way
to fine tune the index size.
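To see why reversing helps: a word ending in "quiz" is, once reversed, a word starting with "ziuq", so a leading wildcard turns into an ordinary trailing one. A toy illustration in Python, using a sorted list as a stand-in for an index (this says nothing about how swish stores its index, and the function names are made up):

```python
from bisect import bisect_left


def build_reversed_index(words):
    # Sorted (reversed word, original word) pairs, one per distinct word.
    return sorted((w[::-1], w) for w in set(words))


def suffix_search(index, suffix):
    """Find words ending in `suffix` via a prefix scan over the reversed words."""
    key = suffix[::-1]
    # (key,) sorts before any (key, word) pair, so this lands on the first candidate.
    i = bisect_left(index, (key,))
    hits = []
    while i < len(index) and index[i][0].startswith(key):
        hits.append(index[i][1])
        i += 1
    return hits
```

For example, with an index built from "stackenblokkenquiz", "quiz", "quizmaster" and "blokken", `suffix_search(index, "quiz")` finds "quiz" and "stackenblokkenquiz" but not "quizmaster".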
For now you might try this: use -S prog to basically filter your docs and
create a new metaname that holds your part number. Then in your queries if
you see a leading "*" invert that word.
So say you have a query like
foo bar *baz
then in your front end replace *baz so the query becomes
foo bar (reverse_metaname=zab*)
It would be a lot easier if swish did that for you, but if you need it
now, that's one way.
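A rough sketch of both halves of that workaround in Python (the original preprocessor is in Perl, so treat this as pseudocode that happens to run). The metaname "reverse_metaname" is taken from the example above and has to match whatever you configure in the index. The function names are made up, and the whitespace split is a simplification that ignores quoted phrases and parentheses.

```python
META = "reverse_metaname"  # metaname from the example above; must match the index


def reverse_words(text):
    # Indexing side: reverse each whitespace-separated word before it is
    # stored under the extra metaname.
    return " ".join(word[::-1] for word in text.split())


def rewrite_query(query):
    """Query side: turn each leading-wildcard term into a reversed metaname search."""
    out = []
    for term in query.split():
        # Only handle a wildcard at the left end; "*foo*" and a bare "*" pass through.
        if term.startswith("*") and len(term) > 1 and not term.endswith("*"):
            out.append("(%s=%s*)" % (META, term[1:][::-1]))
        else:
            out.append(term)
    return " ".join(out)
```

Here `rewrite_query("foo bar *baz")` gives back `foo bar (reverse_metaname=zab*)`, matching the example above. A term with wildcards at both ends is left alone, since reversing does not help there.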
It's been a long-standing "todo" item to rewrite the query parser in swish.
Right now it's just a brute-force parser. This would be a great project for
someone...
--
Bill Moseley
mailto:moseley@hank.org
Received on Thu Aug 8 14:28:34 2002