Re: HTTP spidering - zero results

From: Angel Parn <angel(at)not-real.mv.parnu.ee>
Date: Thu Jun 15 2000 - 11:17:40 GMT
From: "Mark" <admin@asarian-host.org>
> This is one of the first problems I ran into as well. I solved it by
> making the search string in swishspider less strict. From:
>
> if ($response->header("content-type") eq "text/html") {
> I changed it into:
> if ($response->header("content-type") =~ /text\/html/i) {

Thanks for your answer, but this did not solve my problem.
When I run the swishspider Perl script by itself, it works: I get the
*.links, *.response and *.contents files, and the header == "text/html".
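
For reference, the two comparisons quoted above only differ when the server sends something other than the bare string, for example a charset parameter or different capitalisation. Below is a minimal sketch of that difference. The helper name is_html is hypothetical and not part of swishspider, and its anchored pattern is slightly stricter than the quoted /text\/html/i, which matches the substring anywhere in the header.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of swishspider): true when a Content-Type
# header value denotes HTML, ignoring case and any trailing parameters
# such as "; charset=...".
sub is_html {
    my ($content_type) = @_;
    return 0 unless defined $content_type;
    return $content_type =~ m{^\s*text/html\s*(?:;|$)}i ? 1 : 0;
}

# Compare the strict string test with the pattern test on sample values.
for my $value ('text/html', 'text/html; charset=iso-8859-1',
               'Text/HTML', 'text/plain') {
    my $exact = $value eq 'text/html' ? 1 : 0;
    printf "%-34s exact=%d pattern=%d\n", "'$value'", $exact, is_html($value);
}
```

In my case the header is already exactly "text/html", so both tests agree and the problem must be elsewhere.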

That's strange: the Perl helper script is OK, but when I run the
swish-e executable:

/home/bin/swish-e \
    -S http \
    -i 'http://www.mysite.com/index.php3?date=2000/06/10' \
    -f /home/web/day.swe \
    -c /home/web/search/pp.cfg

I get the file day.swe, but it contains no keywords, only the ordinary
structure of a .swe file (lots of numbers and stuff).

There must be something I'm missing, but what? :(
It seems that swish-e misbehaves (?). Now that I think of it, I compiled
the swish-e executable on a different system from the one I currently
run it on. Could it be that all other functions (searching, indexing in
FS mode) work but HTTP indexing does not? I can check the exact Linux
version numbers on both systems.


Angel
Received on Thu Jun 15 07:20:14 2000