Re: Strange Conversion error

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Oct 06 2004 - 16:15:56 GMT

On Wed, Oct 06, 2004 at 07:28:55AM -0700, David Nickel wrote:
> Every time I index my site I get the follow conversion errors.
> 
> input conversion failed due to input error
> Bytes: 0xA0 0x20 0x3C 0x46
> 
> Any help or suggestions would be much appreciated.

How about:

    http://google.com/

I find pasting the quoted error message into google normally works.
At least, that's what I tried.

Seems like libiconv is reporting a conversion error to libxml2.
iconv returned the error EILSEQ (illegal byte sequence) and then
libxml2 reported that as the above error.

I didn't look at all the code in libxml2 -- it seems (from the
comments in encoding.c) like libxml2 is trying to convert a latin-1
stream to utf8.  That doesn't sound right to me, as I'd think there
would not be an invalid latin-1 sequence possible (are there invalid
codepoints for latin-1?).  So maybe the comments in encoding.c don't
apply.

More likely, my guess is that you have an input file with an invalid
utf8 character sequence.
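
The bytes from the error message illustrate the difference. Here is a
minimal Python sketch (the helper name utf8_error_offset is made up for
this example, and none of this is libxml2 or iconv code) showing that
the same four bytes decode fine as latin-1 but are rejected as utf8:

```python
# The four bytes from the error message quoted above.
data = bytes([0xA0, 0x20, 0x3C, 0x46])

# latin-1 maps every byte value to a code point, so this decode cannot fail.
as_latin1 = data.decode("latin-1")


def utf8_error_offset(raw):
    """Return the offset of the first byte that is invalid UTF-8, or None.

    Hypothetical helper for illustration only.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


print(repr(as_latin1))
print(utf8_error_offset(data))
```

In latin-1, 0xA0 is a non-breaking space. In utf8 it can only appear as
a continuation byte, so a decoder that meets it at the start of a
character has to give up.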

If you have iconv (the program) on your machine you might try
converting your document and see if you see a similar error message.

Unfortunately, libxml2 doesn't really offer any help on which file
caused the problem.  You might need to index with a large -v setting
to see which file was being processed when the error showed up.
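
Another way to narrow it down, outside of swish-e, is to scan the
documents yourself. This is only a sketch: it assumes the files are
meant to be utf8 (a file that is legitimately latin-1 will be flagged
too), and find_invalid_utf8 is a name invented for this example.

```python
import os


def find_invalid_utf8(root):
    """Yield (path, offset) for each file under root that is not valid UTF-8.

    Hypothetical helper; assumes every file is supposed to be UTF-8.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Read raw bytes so that no decoding happens implicitly.
            with open(path, "rb") as handle:
                raw = handle.read()
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as exc:
                # exc.start is the offset of the first offending byte.
                yield path, exc.start
```

Looping over find_invalid_utf8("/path/to/docs") and printing each pair
gives a list of candidate files and the byte offset where decoding
first fails, which you can then compare against the bytes in the error.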


-- 
Bill Moseley
moseley@hank.org

Received on Wed Oct 6 09:16:09 2004