At 03:43 PM 05/01/02 -0700, Hsiao Ketung Contr 61 CS/SCBN wrote:
>I've been trying to get swish-e HTTP crawler working for the last 2 days.
>The HTTP crawler works if the IndexDir is set to a URL on my own server
>where I'm running the swish-e.
>
>It's when I set the IndexDir to a URL other than my own server that I get
>"no word indexes" type of output.
If you are using the -S http method, then swish is using a Perl helper
program called swishspider. You can run this program on its own to see
whether it's fetching docs.
~/swish-e/src > ./swishspider
Usage: SwishSpider localpath url
~/swish-e/src > ./swishspider . http://swish-e.org/index.html
~/swish-e/src > ll -t | head
total 52672
-rw-r--r-- 1 lii users 5321 May 1 15:52 ..contents
-rw-r--r-- 1 lii users 638 May 1 15:52 ..links
-rw-r--r-- 1 lii users 14 May 1 15:52 ..response
That will tell you whether it can fetch the remote doc.
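If you want to script that check, here is a rough shell sketch. It assumes swishspider writes localpath.response, localpath.contents and localpath.links (as in the listing above), and that the first line of the .response file is the HTTP status code. That layout is an assumption on my part, so check it against your copy of swishspider. The function name check_spider_output is made up for illustration.

```shell
# check_spider_output PREFIX
# Hypothetical helper: inspects the files swishspider is assumed to leave
# behind for a given localpath prefix.
# Assumption: the first line of PREFIX.response is the HTTP status code.
check_spider_output() {
    prefix=$1

    # No (or empty) response file means the fetch never completed.
    if [ ! -s "${prefix}.response" ]; then
        echo "no response file: fetch did not run"
        return 2
    fi

    # Read the status code from the first line of the response file.
    status=$(head -n 1 "${prefix}.response")
    if [ "$status" != "200" ]; then
        echo "fetch failed: status $status"
        return 1
    fi

    # A 200 with no document body gives swish nothing to index.
    if [ ! -s "${prefix}.contents" ]; then
        echo "status 200 but empty contents"
        return 1
    fi

    echo "fetched OK"
    return 0
}
```

After running ./swishspider . http://swish-e.org/index.html as above, you would call check_spider_output . (the prefix "." is what produces the ..response and ..contents names in the listing). Try it with a URL on the remote server that gives you trouble.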
>Also, I have to modify the Perl script in cgi-bin to make the HTTP crawler
>result
>show up correctly. I have to add this line:
>$url =~ s/http\:\/\/www\.losangeles\.af\.mil\///;
> into the while loop in
> sub search_parse.
Don't really follow that. You may be describing a cgi script I'm not
familiar with.
--
Bill Moseley
mailto:moseley@hank.org
Received on Wed May 1 23:01:28 2002