Re: Re: exclusions in indexing

From: Michael J. Giarlo <leftwing(at)not-real.rci.rutgers.edu>
Date: Sun Oct 10 1999 - 18:31:08 GMT
At 01:56 PM ET, 09/30/1999, Ron Samuel Klatchko wrote:
>
>You can't for now.  Although swish will only follow <A> tags.  Just out
>of curiosity, if a site thought it would be helpful to have a link (not
>a form) to a CGI, why do you not want to follow it?

Would adding the following line near the bottom of the swishspider script
perhaps do what I want?  

        if ($link =~ /.*\.cgi.*/) { return; }

That way, when a document is spidered to retrieve links, any link
containing the '.cgi' string is ignored.  (Obviously, this isn't an ideal
solution, since it will affect the indexing of every site rather than just
the one I have in mind.  Not to mention that CGI scripts with other
extensions will slip right through.  But for this case, both of those
concerns are irrelevant, and if this wee kludge does what I want it to,
then I'm happy.)
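
For what it's worth, the two caveats above (every site is affected, other
extensions get through) could be narrowed with a slightly more specific test.
The following is only a sketch under assumptions: the helper name
should_skip_link, the host www.example.com, and the extension list are all
hypothetical, nothing here is part of swishspider itself, and it assumes the
link has already been resolved to an absolute http(s) URL.  It also matches
the extension only within the URL path, so a '.cgi' appearing in a query
string does not trigger it.  (As an aside, the leading and trailing '.*' in
the one-liner don't change what matches; /\.cgi/ behaves the same.)

```perl
use strict;
use warnings;

# Hypothetical settings, not swishspider options: filter only this host,
# and treat these extensions as scripts.
my $SKIP_HOST = 'www.example.com';
my @SKIP_EXT  = qw(cgi pl);

# Hypothetical helper: returns 1 if the link should be skipped, else 0.
sub should_skip_link {
    my ($link) = @_;

    # Split an absolute http(s) URL into host and path (query and
    # fragment are left out).  Anything that doesn't parse is kept.
    my ($host, $path) = $link =~ m{^https?://([^/:?#]+)(?::\d+)?([^?#]*)}i
        or return 0;

    # Leave links on every other host alone.
    return 0 unless lc($host) eq $SKIP_HOST;

    # Match the extension at the end of the path, or just before
    # extra path info such as /script.pl/extra.
    my $ext = join '|', map { quotemeta } @SKIP_EXT;
    return $path =~ m{\.(?:$ext)(?:/|$)}i ? 1 : 0;
}

# Print the decision for a few sample links.
for my $link (
    'http://www.example.com/cgi-bin/search.cgi?q=x',
    'http://www.example.com/page.html?ref=a.cgi',
    'http://other.example.org/search.cgi',
) {
    print should_skip_link($link) ? 'skip' : 'keep', "  $link\n";
}
```

In the spider, the one-line test would then read something like
"if (should_skip_link($link)) { return; }", assuming $link holds the
absolute URL at that point in the script.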

And out of curiosity, why isn't the list configured to toss a 
"Reply-To: swish-e@sunsite.berkeley.edu" header into posts, so that when
people reply, it defaults to the list?  
Received on Sun Oct 10 11:37:25 1999