Re: [swish-e] err: External program failed to return required headers Path-Name (Swish-e 2.4.5)

From: <Rene.Kloos(at)>
Date: Thu Mar 29 2007 - 13:17:05 GMT
Hello Clint,

Although Bill Moseley can give you the most accurate answer, I can already
say that I stumbled into the same problem at one point. In your setup the
spider creates an output file which Swish-e then uses as its input for
indexing. This file must contain several headers for every spidered page,
e.g. Path-Name and Content-Length. Swish-e uses the Content-Length value to
decide how much content to read before it expects the next header. The fact
that your warning contains 'h-Name' shows that Swish-e read 3 characters too
many, i.e. 'Pat', so it doesn't find the next 'Path-Name' header where it
expects it. This means that the value in the Content-Length header does not
match the actual length of the content.
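
To make the off-by-three effect concrete, here is a minimal Perl sketch. The two-document stream, the file names and the helper next_header_line are made up for illustration; this is not Swish-e's actual parser, only a model of "skip Content-Length bytes, then expect a header".

```perl
use strict;
use warnings;

# Hypothetical two-document stream: headers, blank line, body, then the
# next document's headers immediately after the body.
# The first Content-Length claims 6, but the body "abc" is only 3 bytes.
my $stream = "Path-Name: /a.html\nContent-Length: 6\n\nabc"
           . "Path-Name: /b.html\nContent-Length: 3\n\nxyz";

# Return the line a reader would treat as the next header after skipping
# $claimed_length bytes of body for the first document.
sub next_header_line {
    my ($data, $claimed_length) = @_;
    my $pos = index($data, "\n\n") + 2 + $claimed_length;
    my $eol = index($data, "\n", $pos);
    return substr($data, $pos, $eol - $pos);
}

print next_header_line($stream, 3), "\n";   # length matches the body
print next_header_line($stream, 6), "\n";   # length overstated by 3
```

With the matching length the reader lands exactly on 'Path-Name: /b.html'; with the overstated one it lands on 'h-Name: /b.html', which is the shape of the warning you are seeing.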

I guess this has to do with the UTF-8/Latin-1 issue when using libxml2, but I
am certainly no expert in that area :-)

In one of the posts it is suggested to modify the line that computes the
byte count, so that

my $bytecount = length pack 'c0a*', $$content;

should become:

my $bytecount = do { use bytes; length($$content) };

This did actually NOT do the trick for me. The following DID:

my $bytecount = length($$content);

I have been happy indexing ever since (static pages, not dynamic ones).
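
A possible reason why the different variants give different results is that the count depends on the encoding in which the content is finally written. The sketch below uses the core Encode module and a made-up four-character string; which encoding applies in your setup is an assumption you would have to check against the real output file.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical content: four characters, one of them non-ASCII (e-acute).
my $content = "Ren\x{e9}";

# Number of characters in the string.
my $chars = length($content);

# Number of bytes if the content is written out as UTF-8.
my $utf8_bytes = length(encode('UTF-8', $content));

# Number of bytes if the content is written out as Latin-1.
my $latin1_bytes = length(encode('ISO-8859-1', $content));

print "chars=$chars utf8=$utf8_bytes latin1=$latin1_bytes\n";
```

Here the character count and the Latin-1 byte count are both 4, while the UTF-8 byte count is 5. If the header is computed one way and the file is written the other way, every non-ASCII character shifts the next header by one byte.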

Hope this helps,
René

wrote on 29/03/2007 12:09:32:

> SWISH-E 2.4.5
> Linux 2.6.9-42.0.8.ELsmp #1 SMP Tue Jan 23 13:01:26 EST 2007 i686 i686
> i386 GNU/Linux
> I initially indexed only static pages, which worked fine. However it has
> become necessary to index the database driven pages as well.
> I setup and got as far as having it generate the output.txt file which is
> around 40MB+, using /usr/local/lib/swish-e/ default http:// > output.txt
> No errors were reported.
> But now when I run
> swish-e -c config -S prog -i stdin < output.txt
> I get this fatal error soon after
> Warning: Unknown header line: 'h-Name:'
> from program
> err: External program failed to return required headers Path-Name:.
> I have looked up this error, but the posts are from 2003-2005 and
> although explain possible reasons why this is happening, don't really
> show how to fix, or workaround this error.
> I'm only indexing html text files and text from dynamic pages, not
> images, pdfs or anything like that.
> How does one fix this?
> Regards
> Clint
Received on Thu Mar 29 09:17:25 2007