question about robots.txt

From: Bill Conlon <bill(at)not-real.tothept.com>
Date: Wed Jul 28 2004 - 01:24:01 GMT
I have a robots.txt file:

User-agent: *
Disallow: /archive/
Disallow: /css/
Disallow: /images/
Disallow: /inc/
Disallow: /js/
Disallow: /swish-e/

Yet when I index with spider.pl, I see, for example:

>> +Fetched 2 Cnt: 51 http://beowulf3.tothept.com/archive/2002-December/author.html 200 OK text/html ??? parent:http://beowulf3.tothept.com/archive/index.html

which clearly shows that both the parent and the child are in /archive/ -- the
first disallowed directory.
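
As a cross-check that the rules themselves say what I think they say, here is a minimal sketch using Python's standard urllib.robotparser. It is not part of spider.pl and says nothing about how spider.pl reads robots.txt. The helper name is_allowed and the root-level /index.html URL are my own assumptions for illustration; the rules and the two /archive/ URLs are the ones quoted above.

```python
from urllib.robotparser import RobotFileParser

# Rules copied verbatim from the robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
Disallow: /css/
Disallow: /images/
Disallow: /inc/
Disallow: /js/
Disallow: /swish-e/
"""


def is_allowed(url, agent="*"):
    """Return True if the quoted rules permit `agent` to fetch `url`.

    Hypothetical helper for illustration; it parses the rules from the
    string above rather than fetching robots.txt from the server.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    urls = (
        # Child and parent URLs from the spider output above.
        "http://beowulf3.tothept.com/archive/2002-December/author.html",
        "http://beowulf3.tothept.com/archive/index.html",
        # Assumed URL outside every disallowed directory, for contrast.
        "http://beowulf3.tothept.com/index.html",
    )
    for url in urls:
        print(url, "allowed" if is_allowed(url) else "disallowed")
```

By standard prefix matching, both /archive/ URLs should come back as disallowed, so the rules as written look correct to me.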

Can someone enlighten me?


Bill Conlon

To the Point
345 California Avenue Suite 2
Palo Alto, CA 94306

office: 650.327.2175
fax:    650.329.8335
mobile: 650.906.9929
e-mail: mailto:bill@tothept.com
web:    http://www.tothept.com
Received on Tue Jul 27 18:24:25 2004