Recursive Spidering in Windows

From: <narayananps(at)not-real.hp.com>
Date: Mon Oct 27 2003 - 15:28:47 GMT
Hi,
   I have a small problem. I have installed swish-e 2.2.3 on Windows 2000
and configured it to index one of our intranet sites.
I use the -S prog option with the spider.pl and spider.conf files
(taken from the example directory, but customised for my site).
I am trying a very simple spider.conf, something like:
----------------------------------------------------------------------------
    @servers = (
        {
            base_url  => 'http://domain:port/abcd/index.html',
            email     => 'my@email.com',
            delay_min => .01,
        },
    );
    1;
----------------------------------------------------------------------------

But I am not able to get the spider to recurse through all the links in
index.html.
I see from the perldoc that the default HTML tag for links is <a>, so I
don't specify it in my conf.
Still, I am not able to do recursive spidering.
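
For comparison, here is a sketch of the same spider.conf with the link tags named explicitly instead of relying on the default. It assumes the link_tags option takes an array reference of tag names, as the spider.pl documentation describes; the 'frame' entry is only a guess at what might matter if index.html is a frameset, and may not apply to this site.

```perl
# spider.conf sketch: same server entry, link tags spelled out.
@servers = (
    {
        base_url  => 'http://domain:port/abcd/index.html',
        email     => 'my@email.com',
        delay_min => .01,

        # Assumption: link_tags is an array reference of tag names.
        # 'a' is the documented default; 'frame' is added here only
        # in case the start page is a frameset (a guess, not a fix).
        link_tags => [qw/ a frame /],
    },
);
1;
```

This changes nothing if the links really are plain <a> tags on the same host and port as base_url, so it is mainly useful for ruling the tag list out as the cause.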

Also, is there a way to configure the spider to go through a proxy on
Windows (I want to spider an external site from inside a firewall)?
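
To make the proxy part concrete: spider.pl fetches pages with LWP::UserAgent, and that module can be pointed at a proxy directly. The standalone sketch below shows only the LWP side. The proxy address, the agent string and the make_ua helper are hypothetical, and whether spider.pl 2.2.3 exposes a setting for this is not something this sketch assumes.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: build a user agent that sends HTTP requests
# through the given proxy URL.
sub make_ua {
    my ($proxy_url) = @_;
    my $ua = LWP::UserAgent->new(
        agent   => 'proxy-check/0.1',   # made-up agent string
        timeout => 10,
    );
    $ua->proxy( 'http', $proxy_url );
    # Alternative: $ua->env_proxy;  # read http_proxy etc. from the environment
    return $ua;
}

# Hypothetical proxy address; replace with the real firewall proxy.
my $ua = make_ua('http://proxy.example.com:8080/');

# Fetch one external page and report the HTTP status line.
my $response = $ua->get('http://www.example.com/');
print $response->status_line, "\n";
```

If this prints a 200 status from inside the firewall, the proxy address is right and the remaining question is only how to hand the same setting to the spider.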

Help please :(


Cheers
Narayanan
Received on Mon Oct 27 15:41:11 2003