
Re: Disallow in Robots.txt

From: James <swish.enhanced(at)>
Date: Mon Jan 08 2007 - 13:52:49 GMT
Thanks, Brad!  Yes, I had seen those lines before.
I am also wondering where the "2.2" is being generated from, since what I
see in the access logs is always "swish-e spider 2.2".
I'll be curious to get Bill's response to this, to confirm.  I am not
confident that this is the whole answer: Yahoo, MSN and Google all write
quite long agent strings to the access logs, and yet the User-agent each
of them matches in robots.txt is just a short one-word token (like psbot,
above).  So it seems there is more to this.
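
For what it's worth, robots.txt matching is generally done against the
short product token at the start of the agent string, not against the
whole line that shows up in the access logs, which is why a one-word
token like psbot is enough.  A quick sketch of that matching behavior,
using Python's standard urllib.robotparser purely as an illustration
(this is not part of Swish-e, and the agent string and URLs are made up):

```python
# Sketch: robots.txt User-agent matching is substring-based on the
# product token, so a short token catches a long logged agent string.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: swish-e
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# The full agent string a server might log; "swish-e" still matches it.
full_agent = "swish-e spider 2.2 http://swish-e.org/"
print(rp.can_fetch(full_agent, "http://example.com/page.html"))  # False (disallowed)
```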

On 1/8/07, Brad Miele  wrote:
> Pretty sure you can set the agent there, yep, line 143:
> agent       => 'swish-e spider'
> regards,
> Brad
> ---------------------
> Brad Miele
> VP Technology
> On Mon, 8 Jan 2007, James wrote:
> > Is there a way for other web-masters to disallow Swish-e from crawling
> > their site(s), and is there a way to declare what bot I am?  For
> > instance, I always put the following in my robots.txt files for my
> > web-sites:
> >
> > User-agent: psbot
> > Disallow: /
> >
> > Is there some kind of configuration file that declares what bot
> > (User-agent) I am when using Swish-e, and can that be changed to
> > something I customize and declare publicly, so that anyone can
> > disallow my user agent?
> >
> > I ask these things in general because I know that Swish-e has a
> > polite spider, obeying robots.txt and the noindex and nofollow
> > directives.
> >
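
To answer my own question in part, based on the line Brad quoted: the
agent appears to be set per-server in the spider's configuration.  A
minimal sketch of a custom spider config, assuming the @servers layout
from the spider's sample configuration (the URLs, agent string, and
email below are illustrative, not defaults):

```perl
# Sketch of a custom SwishSpiderConfig.pl -- base_url, agent, and email
# values here are made up for illustration.
@servers = (
    {
        base_url => 'http://example.com/index.html',
        agent    => 'mybot/1.0 (+http://example.com/bot.html)',  # custom User-agent
        email    => 'webmaster@example.com',
    },
);
1;
```

With a custom token like "mybot" published somewhere, other webmasters
could then exclude it with "User-agent: mybot" in their robots.txt.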

Received on Mon Jan 8 05:52:55 2007