

Re: robots.txt

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Oct 31 2005 - 14:43:59 GMT
On Mon, Oct 31, 2005 at 06:34:59AM -0800, J Robinson wrote:
> Any tips on how I can debug this? Is there a debug
> flag for spider.pl that shows robots.txt being parsed
> and/or urls being matched against it, or anything like
> that?

Set debug to "skipped" and the spider will report when a file is skipped
due to robots.txt.
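As a sketch of where that setting lives: in a spider.pl config file the
debug level can be set per server entry. The key names below are assumed
from the spider.pl docs as I remember them; verify against your swish-e
version before relying on them.

```perl
# SwishSpiderConfig.pl -- minimal sketch, not a tested config.
# The "debug" key takes a comma-separated list of levels; "skipped"
# reports files the spider refuses to fetch (e.g. due to robots.txt).
@servers = (
    {
        base_url => 'http://example.com/index.html',
        email    => 'you@example.com',
        debug    => 'skipped',
    },
);
1;
```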

Then just run the spider on one of the files it says it is skipping.

When I've debugged this in the past, I found that the robots.txt file was
not set up correctly.
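For example (a generic illustration, not taken from the original thread),
a robots.txt like this blocks every path for every crawler, which is a
common accidental misconfiguration:

```
# Blocks ALL crawlers from ALL paths -- probably not what was intended:
User-agent: *
Disallow: /

# To block only one directory instead, the rule would be:
#   Disallow: /private/
# An empty "Disallow:" line allows everything.
```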

-- 
Bill Moseley
moseley@hank.org

Unsubscribe from or help with the swish-e list: 
   http://swish-e.org/Discussion/

Help with Swish-e:
   http://swish-e.org/current/docs
   swish-e@sunsite.berkeley.edu
Received on Mon Oct 31 06:43:59 2005