Hi:
I tested your spider against my personal site and it worked fine. I then
tested it against
http://www.prophecyinthenews.com/index.asp
and it does not work. Only one file gets indexed:
Indexing Data Source: "HTTP-Crawler"
retrieving http://www.prophecyinthenews.com/ (0)...
(18 words)
Removing very common words... no words removed.
Writing main index... 14 unique words indexed.
Writing file index... 1 file indexed.
Running time: 4 seconds.
Indexing done!
[usr147@unix2502 search]$
Bob
-----Original Message-----
From: swish-e@sunsite.berkeley.edu
[mailto:swish-e@sunsite.berkeley.edu]On Behalf Of David Norris
Sent: Friday, June 23, 2000 10:55 PM
To: Multiple recipients of list
Subject: [SWISH-E] Patched Spider
PropheZine Webmaster wrote:
> 2. applied the spider and spider2 patches
The spider patch is already applied to http.c in 1.3.2.
I have a patched and tested swishspider at:
http://www.webaugur.com/wares/files/swishspider
I also changed the #! to /usr/bin/perl since that is a standard location
for the PERL binary. The reference in the distribution is SPARC
specific.
The spider works perfectly with "perl, version 5.005_03 built for
i386-linux". I can't test on BSD since the version of PERL on that system
is ancient (as is the system).
> if( substr($response->header("content-type"), 0, length("text/html")) eq
> "text/html" ) {
Looks correct to me.
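For anyone following along, here is a minimal standalone sketch of what
that prefix check does. It uses plain strings in place of the real response
object, and is_html is a hypothetical helper name, not something that
exists in swishspider:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of swishspider): returns 1 when a
# Content-Type value starts with "text/html", so that trailing parameters
# such as "; charset=..." do not prevent a match.
sub is_html {
    my ($content_type) = @_;
    return 0 unless defined $content_type;
    my $prefix = "text/html";
    return substr($content_type, 0, length($prefix)) eq $prefix ? 1 : 0;
}

# Sample header values, chosen for illustration only.
for my $type ("text/html", "text/html; charset=iso-8859-1",
              "text/plain", "image/gif") {
    printf "%-32s %s\n", $type, is_html($type) ? "index" : "skip";
}
```

The comparison is case-sensitive, so a server sending "Text/HTML" would be
skipped by this sketch.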
--
,David Norris
Dave's Web - http://www.webaugur.com/dave/
Dave's Weather - http://www.webaugur.com/dave/wx
ICQ Universal Internet Number - 412039
E-Mail - dave@webaugur.com
Received on Sat Jun 24 06:46:52 2000