> > Is it possible to use iconv(charset_of_the_document_being_indexed,
> > utf-8) instead of UTF8Toisolat1()?
>
> You mean convert from libxml2's internal utf-8 back to the encoding of
> the original document? Probably -- I assume there's some way to have
> libxml2 tell you what it was encoding from.
Yes, that would be great.
> But that would not work if you have documents of different encodings.
> The index itself has to be one encoding. That's why I was saying that
> iconv could be used with a configuration setting to say what 8-bit
> encoding to use.
Why wouldn't it work with different encodings? Wouldn't it work just as
it does when the document is indexed with the HTML parser?
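
To make the iconv idea concrete, here is a rough sketch of the conversion
step, not actual swish-e code. The function name utf8_to_index_charset and
the index_charset parameter are hypothetical; the parameter could come from
a configuration setting or from whatever libxml2 reports as the document's
original encoding. Only standard iconv calls are used.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert a NUL-terminated UTF-8 string (as libxml2 hands it back) to the
 * encoding named by index_charset, e.g. "ISO-8859-1".
 * Returns the number of bytes written to out (excluding the NUL),
 * or -1 if the conversion cannot be set up or fails. */
static long utf8_to_index_charset(const char *index_charset,
                                  const char *in, char *out, size_t outsize)
{
    assert(index_charset && in && out && outsize > 0);

    /* Note the argument order: iconv_open(tocode, fromcode). */
    iconv_t cd = iconv_open(index_charset, "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;        /* iconv() wants char **, input is not modified */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;  /* keep room for the terminating NUL */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                 /* bad input, or output buffer too small */

    *outp = '\0';
    return (long)(outp - out);
}
```

How characters that do not exist in the target encoding are handled
depends on the iconv implementation, so a real version would need to
decide what to do with them.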
> > > What tolower does depends on the tolower
> > > function swish-e was linked with.
> >
> > setlocale(charset_of_the_document_being_indexed) on-the-fly?
>
> Well, you want tolower to work for the encoding that the index is
> encoded in.
Of course. So, is this possible?
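
Something along these lines is what I have in mind; it is only a sketch
under my own assumptions, not swish-e code. The function name
lowercase_in_locale is hypothetical, and the locale name passed in would
have to match the encoding of the index. Locale names are
platform-specific, and setlocale() changes the locale for the whole
process, so this is not safe with threads.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Lowercase an 8-bit string in place using tolower() under the given
 * LC_CTYPE locale, then restore the previous locale.
 * Returns 0 on success, -1 if the locale is unavailable or on allocation
 * failure (the string is left untouched in that case). */
static int lowercase_in_locale(const char *locale_name, char *s)
{
    assert(locale_name && s);

    /* Copy the current locale name: the string setlocale() returns may be
     * overwritten by a later setlocale() call. */
    const char *cur = setlocale(LC_CTYPE, NULL);
    char *saved = cur ? malloc(strlen(cur) + 1) : NULL;
    if (!saved)
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_CTYPE, locale_name) == NULL) {
        free(saved);
        return -1;
    }

    /* tolower() must be given a value representable as unsigned char. */
    for (unsigned char *p = (unsigned char *)s; *p; p++)
        *p = (unsigned char)tolower(*p);

    setlocale(LC_CTYPE, saved);
    free(saved);
    return 0;
}
```

With a locale for the index encoding installed (the exact name, something
like "en_US.ISO-8859-1", varies by system), accented 8-bit characters
should be lowercased as well; in the "C" locale only A-Z are affected.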
Received on Thu Dec 11 20:41:06 2003