At 10:03 AM 04/22/02 -0700, Linda DeBoer wrote:
> Whenever I run swish-e against a site which has a url pointing back
>to the home page, it loops.
You don't mean "loop" in that it indexes the same URL more than once, right?
I don't know how to make the -S http method do that. Any robots.txt tricks?
But, if you are using 2.1-dev, and the -S prog method with spider.pl then
it's rather easy to do this.
In the config you can say:
    test_url => sub {
        my $uri = shift;
        return $uri->path =~ m!^/some/path!;
    },
That just says every path must begin with /some/path. Note it's a plain
prefix match, so /some/pathology would get through as well.
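
For context, here is a rough sketch of how that callback might sit in a complete spider.pl config file. The host name, email address, and /some/path prefix are placeholders, and the key names (base_url, email, test_url) are as I remember them from the spider.pl docs, so check them against your copy. In this version the pattern ends in (?:/|$) so it stops at a directory boundary:

```perl
# SwishSpiderConfig.pl -- sketch only; host, email and path are placeholders
@servers = (
    {
        base_url => 'http://www.example.com/some/path/',  # where the spider starts
        email    => 'you@example.com',                    # contact address for the spider

        # Called with a URI object for each link found;
        # return true to follow the link, false to skip it.
        test_url => sub {
            my $uri = shift;
            return $uri->path =~ m!^/some/path(?:/|$)!;
        },
    },
);

1;  # the config file is loaded as Perl code, so end with a true value
```

Anything the callback rejects is never fetched, so links pointing back outside /some/path are simply dropped.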
Another option, which would be fast, would be to run another web
server/virtual host on a different port, and change the document root.
--
Bill Moseley
mailto:moseley@hank.org
Received on Mon Apr 22 17:24:39 2002