I'm trying to use the prog-method, but it seems that my program and
swish have a different idea of what content-length means, and I keep
getting
"External program failed to return required headers.."
I redirected my prog's output to a file, which can be viewed at
http://www.valt.helsinki.fi/staff/harmo/in.txt
(or exactly the same content in ..../in.exe, if it is easier to download
in binary form, in case somebody is willing to test it on Unix).
Reading that same file from stdin gives the following:
>type c:\x2\in.txt |swish-e -S prog -i stdin -v 3
Indexing Data Source: "External-Program"
Indexing "stdin"
http://www.valt.helsinki.fi/index.htm - Using DEFAULT (HTML2) parser -
(236 words)
Warning: Unknown header line: ': Arial, Helvetica, sans-serif;' from
program stdin
Warning: Unknown header line: 'line-height: 14pt;' from program stdin
Warning: Unknown header line: '}' from program stdin
err: External program failed to return required headers Path-Name:
.
The process tried to write to a nonexistent pipe.
-----------
The "unknown header lines" are nowhere near the end of the content,
which is where swish should start looking for the next headers.
I am using #10 (LF) as the line separator between header lines and
between the headers and the data (using #13#10 (CRLF) does not change
the behaviour much).
I've checked the lengths of the text between the headers, and they seem
right (+-1, as I'm not sure about the CRLF business, or whether the empty
line separating headers from data is counted or not - but I've tested
it with subtracting or adding 1 to the content length). It could be that
the error is +-2 or more, but the messages above seem to indicate
that swish thinks I'm not even near the mark (it seems to read about
100 chars more than given by content-length).
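
For comparison, here is a minimal sketch in Python of how a prog-method
program might emit one document. It is not my actual program. It assumes
that Content-Length counts only the bytes of the document body (not the
empty separator line), that headers end with a single LF, and that the
output goes to a binary stream so nothing rewrites the line endings. The
names format_record and emit are made up for illustration.

```python
import sys


def format_record(path_name: str, content: bytes) -> bytes:
    """Build one record: header lines, an empty line, then the raw body.

    Assumption: Content-Length is the byte length of the body only.
    """
    headers = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(content)}\n"  # bytes, not characters
        "\n"  # empty line separating headers from data
    )
    return headers.encode("utf-8") + content


def emit(path_name: str, content: bytes, out=None) -> None:
    """Write one record to a binary stream (stdout's byte buffer by default)."""
    if out is None:
        # sys.stdout.buffer bypasses text-mode newline translation.
        out = sys.stdout.buffer
    out.write(format_record(path_name, content))
    out.flush()
```

Calling emit("http://www.valt.helsinki.fi/index.htm", body) with body read
from a file opened in "rb" mode keeps the declared length and the bytes
actually written in step.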
I'm using version 2.4.0.
I need a prog of my own because I'm trying to index XML files but return
the URL of a generated HTML file (and I need a special program to work
out that address).
I suspect something win-specific, maybe something to do with
linefeeds.
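
To put a number on the linefeed suspicion, here is a small Python sketch
that measures how far a byte count would drift if every bare LF in a
document were turned into CRLF on output. Whether that translation
actually happens in my setup is an assumption, not something I have
confirmed, and the function names are made up for illustration.

```python
def crlf_drift(content: bytes) -> int:
    """Number of bytes added if each bare LF is rewritten as CRLF."""
    # LFs that are already part of a CRLF pair are left alone.
    return content.count(b"\n") - content.count(b"\r\n")


def translated_length(content: bytes) -> int:
    """Byte length of the content after LF -> CRLF translation."""
    return len(content) + crlf_drift(content)
```

A document with 100 bare-LF lines would come out 100 bytes longer than
the length computed before translation.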
-Timo
P.S. Sorry if this arrives as a duplicate; my first attempt seemed to fail.
Received on Wed Oct 13 21:36:20 2004