Re: Fw: Re: 8-bit chars

From: John Angel <angel_john(at)not-real.hotmail.com>
Date: Thu Dec 11 2003 - 19:23:03 GMT
> You are free to modify parser.c to use iconv and convert back to
> Windows-1250, as I suggested.  But that won't work for everyone else.

Is it possible to use iconv(charset_of_the_document_being_indexed, utf-8)
instead of UTF8Toisolat1()?
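
To make the question concrete, here is a minimal sketch of a generic iconv-based conversion that takes the charset names as parameters. The function name convert_string() is hypothetical and is not part of swish-e or libxml2. Which direction parser.c would actually need (UTF-8 to the document charset, or the reverse) is an assumption left to the caller.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not swish-e code: convert the NUL-terminated
 * string `in` from charset `from` to charset `to`.
 * Returns a malloc'd NUL-terminated string, or NULL if the charset
 * pair is unsupported or any character cannot be converted. */
char *convert_string(const char *from, const char *to, const char *in)
{
    /* Note the argument order: iconv_open(tocode, fromcode). */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(in);
    size_t out_left = in_left * 4;      /* generous bound for 8-bit <-> UTF-8 */
    char *out = malloc(out_left + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)in;          /* iconv() takes a non-const char ** */
    char *out_ptr = out;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {             /* invalid or unconvertible input */
        free(out);
        return NULL;
    }

    *out_ptr = '\0';
    return out;
}
```

For example, convert_string("UTF-8", "ISO-8859-1", text) would cover what UTF8Toisolat1() does today, and the target name could instead come from the document being indexed. Charset names such as "WINDOWS-1250" or "CP1250" depend on the iconv implementation in use.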


> What tolower does depends on the tolower
> function swish-e was linked with.

Could swish-e call setlocale() on the fly, with a locale that matches the
charset of the document being indexed?


> If the index contains words encoded in the 8859-1 character set (or
> Windows-1250) and someone submits a query in utf-8 with characters that
> don't map to 8859-1 that's a conversion failure.

OK, then we'll pass two parameters to the search script: from_encoding and
to_encoding. We can disregard conversion failures; there is nothing we can
do about them.
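
A rough sketch of what that could look like on the search side, assuming the script hands the query to a C helper. The name convert_lossy() is hypothetical; from_encoding and to_encoding are the two parameters proposed above. "Disregarding" a failure is interpreted here as skipping the offending input byte and carrying on, which is only one possible policy.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert `query` from from_encoding to to_encoding,
 * silently dropping input that cannot be converted.
 * Returns a malloc'd NUL-terminated string, or NULL if the charset
 * pair is unsupported or memory runs out. */
char *convert_lossy(const char *from_encoding, const char *to_encoding,
                    const char *query)
{
    iconv_t cd = iconv_open(to_encoding, from_encoding);
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(query);
    size_t out_left = in_left * 4;      /* generous bound for 8-bit <-> UTF-8 */
    char *out = malloc(out_left + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)query;
    char *out_ptr = out;

    while (in_left > 0) {
        if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) != (size_t)-1)
            break;                      /* everything that remained was converted */
        if (errno != EILSEQ)
            break;                      /* truncated input or full buffer: stop */
        in_ptr++;                       /* skip one offending byte and retry */
        in_left--;
    }

    iconv_close(cd);
    *out_ptr = '\0';
    return out;
}
```

With an iconv that reports unconvertible characters as EILSEQ (glibc does), a UTF-8 query containing a character outside 8859-1 comes back with that character removed. Other implementations may substitute a placeholder instead, so the exact result is not portable.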
Received on Thu Dec 11 19:23:08 2003