Re: Spider, but not index?

From: David VanHook <dvanhook(at)>
Date: Wed Jun 23 2004 - 14:39:28 GMT
Wonderful, thanks -- I'll give it a try!

Dave V.

-----Original Message-----
On Behalf Of David Wood
Sent: Wednesday, June 23, 2004 10:05 AM
To: Multiple recipients of list
Subject: [SWISH-E] Re: Spider, but not index?

In your spider config file, put something like this:

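@servers = (
    {
        # base_url and the other server settings go here
        test_response => \&test_response,
    },
);

sub test_response {
    my ($uri, $server) = @_;

    # These URLs should be spidered, but not indexed, as they're too ...
    my @SNUBBED_URLS = (
        # URL paths to snub go here, one string per entry
    );

    foreach my $url (@SNUBBED_URLS) {
        # \Q...\E quotes regex metacharacters such as "." and "?" in the URL
        $server->{no_index} = 1 if $uri->path =~ /\Q$url\E$/;
    }

    return 1;    # a true value tells the spider to keep processing this URL
}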

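To try the matching logic outside of SWISH-E, a minimal standalone sketch might look like the following. The two paths in @snubbed_paths and the FakeURI stand-in class are hypothetical, made up purely for illustration; in a real spider config the spider supplies the URI object and the server hash itself, and this sketch only assumes the object has a path method.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the URI object the spider passes to callbacks;
# only the path() method is needed for this sketch.
package FakeURI;
sub new  { my ($class, $path) = @_; return bless { path => $path }, $class }
sub path { return $_[0]{path} }

package main;

# Hypothetical paths: pages whose links should be followed but whose
# contents should stay out of the index.
my @snubbed_paths = ('/index.html', '/sitemap.html');

sub test_response {
    my ($uri, $server) = @_;

    foreach my $path (@snubbed_paths) {
        # \Q...\E makes "." match a literal dot; "$" anchors at the end of the path
        $server->{no_index} = 1 if $uri->path =~ /\Q$path\E$/;
    }

    return 1;    # true: keep processing this URL
}

# Run the callback against a few sample paths, with a fresh server hash each time.
for my $path ('/index.html', '/docs/sitemap.html', '/docs/install.html') {
    my %server;
    test_response(FakeURI->new($path), \%server);
    printf "%-22s %s\n", $path, $server{no_index} ? 'spider only' : 'spider and index';
}
```

Because the pattern is anchored only at the end, a snubbed path also matches in any subdirectory, so /docs/sitemap.html is treated the same as /sitemap.html. If that is too broad, anchoring the start as well (^ before \Q) would restrict the match to the exact path.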

At 15:40 Wednesday 23-6-2004, David VanHook wrote:

>Is there a relatively easy way to get SWISH-E to spider a page (i.e., to
>follow all of the links on it), but to not index the contents of that same
>page?  I've tried using FileRules title in the config file, but am having no
>luck -- I get a Bad Directive error, even when I paste in the code directly
>from the online docs.
>Dave VanHook

Received on Wed Jun 23 14:39:33 2004