I believe a search of the discussion archives will tell you that
supporting UTF-8 (or any other Unicode encoding) would require a
significant recoding of swish-e. So far, no one has stepped forward to do that.
In your case, if the UTF-8 characters you care about have direct
ISO-8859-1 equivalents, you might play with the WordCharacters and
TranslateCharacters config settings. That way, characters in the upper
half of the 8859 range (byte values above 127) might work for you.
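To make "direct iso8859 equivalents" concrete, here is a rough sketch in Python of a preprocessing step you could run on the fetched pages before indexing. It is not part of swish-e, and the function name to_latin1 is made up for illustration. It transcodes UTF-8 to ISO-8859-1 and reports any characters that have no single-byte equivalent, which are the ones the config settings cannot help with.

```python
def to_latin1(utf8_bytes):
    """Transcode UTF-8 bytes to ISO-8859-1.

    Returns (latin1_bytes, unmapped), where unmapped lists the characters
    that have no ISO-8859-1 equivalent. Those are replaced with b"?".
    Hypothetical helper for illustration; not a swish-e API.
    """
    text = utf8_bytes.decode("utf-8")
    out = bytearray()
    unmapped = []
    for ch in text:
        try:
            # Characters U+0000..U+00FF map to a single Latin-1 byte.
            out += ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            # Anything else (e.g. the euro sign) has no direct equivalent.
            unmapped.append(ch)
            out += b"?"
    return bytes(out), unmapped


if __name__ == "__main__":
    data, missing = to_latin1("Müller zahlt 5 €".encode("utf-8"))
    print(data, missing)
```

German umlauts all fall inside ISO-8859-1, so they should come through as single bytes above 127. Whether swish-e then indexes them as word characters still depends on your WordCharacters setting.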
If someone with better encoding know-how than me sees anything wrong
here, please correct it.
Mammitzsch.T@zdf.de wrote on 8/4/04 6:51 AM:
> Hi everybody,
>
> i try to spider an IIS 6.0 which delivers pages with utf-8 in the
> http-header. As far as i understood the manual, swish-e converts utf-8 to
> iso-8859-1 if i use libxml2 (html2-parser). Unfortunately special chars like
> german umlauts are not recognized if i search through the swish.cgi
> frontend. Also results with umlauts are not displayed correctly. swish-e
> runs on a sun e450 with solaris 5.8. Any ideas?
>
> best regards,
>
> _______________________________________
>
> Thomas Mammitzsch
--
Peter Karman - Software Publications Engineer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Wed Aug 4 06:52:59 2004