Re: Installed on Red Hat Linux Enterprise AS 3 and get

From: Bill Moseley <moseley(at)>
Date: Wed Jun 23 2004 - 16:30:40 GMT
On Wed, Jun 23, 2004 at 08:57:54AM -0700, John Kelley wrote:
> I installed both swish-e2.2.3 and 2.4.2 and get the following (this is from
> 2.4.2):
> Indexing Data Source: "External-Program"
> Indexing "./"
> External Program found: ./
> - Using HTML2 parser -  (11565
> words)
> Warning: Unknown header line: 'elim/TraditionalFarewell.htm'>Traditional
> Farewells</a><br>' from program ./

Can't tell you for sure.  My guess is some type of encoding problem --
perhaps multi-byte characters.

It's somewhat easy to figure out, if you have a good editor.


    ./ > out.file

then look at out.file.  The format looks a lot like an HTTP or email
message.  There are a few header lines, then two newlines, then the
content.  One of the header lines is the content-length.  What's
likely happening is that the content-length value does not match the
actual length of the content for some reason, so swish-e ends up
reading leftover content as if it were a header line.
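If eyeballing out.file gets tedious, a small script can walk the records for you. This is only a rough sketch, not anything that ships with swish-e: the function name check_stream is made up, and the list of header names is my assumption about what the program emits, so adjust it to match what you actually see in out.file.

```python
def check_stream(data, known=("path-name", "content-length", "last-mtime",
                              "document-type", "update-mode", "no-contents")):
    """Walk header / blank line / content records in a byte string.

    Returns None if every record lines up, otherwise a tuple of
    (byte_offset, message) for the first record that looks wrong.
    The `known` header names are an assumption; edit to taste.
    """
    pos = 0
    while pos < len(data):
        # Headers end at the first pair of newlines.
        end = data.find(b"\n\n", pos)
        if end == -1:
            return (pos, "no blank line after headers")
        length = None
        for raw in data[pos:end].split(b"\n"):
            name, sep, value = raw.partition(b":")
            key = name.strip().lower().decode("latin-1")
            if not sep or key not in known:
                return (pos, "unknown header line: %r" % raw[:60])
            if key == "content-length":
                value = value.strip()
                if not value.isdigit():
                    return (pos, "bad content-length: %r" % value)
                length = int(value)
        if length is None:
            return (pos, "missing content-length")
        # Skip exactly `length` BYTES of content, as the header promises.
        pos = end + 2 + length
    return None
```

Read out.file in binary mode (open("out.file", "rb")) and pass the bytes to check_stream. The offset in the result tells you where in the file to point your editor; the record just before that offset is the one whose content-length is suspect.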

In the past, the content-length was reported as the number of
*characters* by the external program, but swish-e expects that to be
the number of *bytes* -- which will be different if you have any
multi-byte characters.  The program was patched to report *bytes*, so
it's been a while since anyone has reported such a problem.  So, I could
be wrong and your problem might be something else.
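To make the characters-versus-bytes difference concrete, here is a minimal sketch. The helper build_record is hypothetical and the two header names are my assumption about the format; the point is only that the length is taken after encoding, not before.

```python
def build_record(path, text, encoding="utf-8"):
    """Build one header/content record with a byte-accurate content-length.

    Hypothetical helper for illustration; the header names are assumed.
    """
    body = text.encode(encoding)           # encode first ...
    header = "Path-Name: %s\nContent-Length: %d\n\n" % (path, len(body))
    return header.encode(encoding) + body  # ... so the length counts bytes


text = "caf\u00e9"                  # 4 characters, the last one is e-acute
print(len(text))                    # character count
print(len(text.encode("utf-8")))    # byte count, larger for multi-byte text
print(build_record("a.html", text))
```

With pure ASCII content the two counts agree, which is why a bug like this can go unnoticed until a document with accented or other non-ASCII characters shows up.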

You might play with your LANG environment variable.  Google will find
lots of reports about Red Hat users and UTF-8 encoding problems.  I have
LANG=en_US on my machine.

Anyway, use of a good editor will allow you to see what's happening.

Bill Moseley

Received on Wed Jun 23 16:30:42 2004