Re: prog, header & Content-Length

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Oct 05 2001 - 12:13:59 GMT
At 05:01 AM 10/5/2001 -0700, Andrea Baruzzo wrote:
>Hi.
>I tried to use the prog access method using
>swish-e-2.1-dev-22-2001-10-04 on Solaris 5.8. I wrote a Java program
>that writes the content of HTML files to standard output, and headers
>for those files, too. This is what my Java program outputs to standard
>output (from first "Path-Name... " to last "...</html>").

Most likely you are not counting your content correctly.  I counted 106
chars, not 104, in your first chunk (that includes the final newline).

> cat z
<html>
<head>
  <title>Document1</title>
</head>
  <body>
      This is the text pippo.
  </body>
</html>
> ls -l z
-rw-r--r--   1 lii      users         106 Oct  5 05:08 z

In Perl you say:
$length = length $content;
print "Path-Name: $path\n";
print "Content-Length: $length\n\n";
print $content;  # not print "$content\n";
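
Since your program is in Java, the same idea might look roughly like the
sketch below. It assumes Content-Length is a count of bytes, so it measures
the encoded byte array rather than String.length(), which counts UTF-16
chars. The class and method names (ProgOutput, formatDoc) are made up for
illustration, and UTF-8 is only an assumed encoding; use whatever encoding
your documents are actually in.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of swish-e: builds one record in the
// Path-Name / Content-Length form used in this thread.
public class ProgOutput {

    static byte[] formatDoc(String path, String content) {
        // Encode first, then count: the header must match the bytes written.
        // UTF-8 is an assumption made for this example.
        byte[] body = content.getBytes(StandardCharsets.UTF_8);
        byte[] head = ("Path-Name: " + path + "\n"
                     + "Content-Length: " + body.length + "\n\n")
                     .getBytes(StandardCharsets.UTF_8);

        // Headers, blank line, then exactly body.length bytes of content.
        // Nothing is appended after the content.
        byte[] record = new byte[head.length + body.length];
        System.arraycopy(head, 0, record, 0, head.length);
        System.arraycopy(body, 0, record, head.length, body.length);
        return record;
    }

    public static void main(String[] args) {
        // The same text as file z above, including its final newline.
        String doc = "<html>\n<head>\n  <title>Document1</title>\n</head>\n"
                   + "  <body>\n      This is the text pippo.\n  </body>\n</html>\n";
        byte[] record = formatDoc("http://somewhere/doc1.html", doc);
        // Write raw bytes; println() would add a newline that was not counted.
        System.out.write(record, 0, record.length);
        System.out.flush();
    }
}
```

The point is the same as in the Perl lines: whatever you count is exactly
what you must write, and nothing extra may follow it before the next
Path-Name header.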

>
>
>Path-Name: http://somewhere/doc1.html
>Content-Length: 104
>
><html>
><head>
>  <title>Document1</title>
></head>
>  <body>
>      This is the text pippo.
>  </body>
></html>              <<< are you counting that newline in the length?
>Path-Name: http://somewhere/doc2.html
>Content-Length: 102
>
><html>
><head>
>  <title>Document2</title>
></head>
>  <body>
>   This is the text pluto.
>  </body>
></html>
>
>
>
>Swish-e correctly invokes my Java program using a small script I wrote,
>but it says:
>
>Indexing Data Sources: "External-Program"
>Indexing "my_script_that_invokes_my_Java_app"
>err: External program failed to return required headers Path-Name: &
>Content-Length:
>.
>
>
>Where is the error?
>
>I need this to index single URLs and not entire web sites without
>downloading each file...
>
>TIA,
>
>    Alan Felice, Andrea Baruzzo

Bill Moseley
mailto:moseley@hank.org
Received on Fri Oct 5 12:14:12 2001