>
> > But I've got some problems searching for words with German umlauts in the
> > swish database. The problem also occurs when searching for words (with
> > umlauts) in simple HTML pages.
> We use the C3-API to 'solve' this by first normalizing any
> Unicode (UTF-8, UTF-7). Then we use it again to convert everything into
> 7-bit ASCII using a look-alike conversion that is also part of the C3 API. So
> things like the 'u-umlaut' become a 'u' (rather than the sound-alike
> conversion, which gives you a 'u' and 'eu').
>
> This is the text we index.
>
> We do the same magic to the search string. Though not very
> beautiful, it does kind of work :-)
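
For readers following along, the look-alike folding described above can be sketched roughly as follows. This is not the C3 API (its calls are not shown in this thread); it is a minimal stand-in using Python's standard unicodedata module, and the names fold_to_ascii, index_key and EXTRA_LOOKALIKES are hypothetical.

```python
import unicodedata

# Hypothetical extra table for characters that have no Unicode decomposition.
# 'ß' is the obvious German case; extend as needed.
EXTRA_LOOKALIKES = str.maketrans({"ß": "ss"})

def fold_to_ascii(text: str) -> str:
    """Reduce text to 7-bit ASCII by look-alike folding (ü -> u)."""
    text = text.translate(EXTRA_LOOKALIKES)
    # NFKD splits 'ü' into 'u' followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII is silently dropped in this sketch.
    return stripped.encode("ascii", "ignore").decode("ascii")

def index_key(text: str) -> str:
    """Apply the same folding to indexed text and to the search string."""
    return fold_to_ascii(text).lower()
```

Because the same function is applied to both sides, a query for "Muller" and a query for "Müller" produce the same key. The price is that words differing only by an umlaut can no longer be told apart.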
Hi!
Yes, that would work as a workaround... ;-)
But IMO a workaround doesn't solve a problem...
Does anyone know how special characters (Umlauts, other language special
chars) are stored within the index file? I haven't looked in the swish-e
source yet.
Is it possible to use a common (mapped) charset to fix this problem?
e.g. mapping ü to ISO-8859-1 characters.
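
To make the question concrete, here is a rough sketch of what I mean by a common charset: every input form of a character is mapped to one ISO-8859-1 byte before indexing and before searching. It says nothing about how swish-e actually stores characters; the function name to_latin1 and the UTF-8 default are assumptions for illustration only.

```python
import html

def to_latin1(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Map input bytes to a single ISO-8859-1 representation."""
    # Decode from whatever encoding the source document uses (assumed known).
    text = raw.decode(source_encoding)
    # Resolve HTML entities such as &uuml; into the real character.
    text = html.unescape(text)
    # Characters outside ISO-8859-1 become '?' in this sketch.
    return text.encode("iso-8859-1", errors="replace")
```

With this, both the entity form and the UTF-8 form of "Müller" end up as the same bytes, so an index and a query that go through the same mapping would agree.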
ciao Rainer
Received on Tue Aug 11 05:38:49 1998