Re: Indexing link contents

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Mar 07 2003 - 15:52:59 GMT
[Please respond to the list, too]

On Fri, 7 Mar 2003, Ander wrote:
> Yes, I've already read 'perldoc spider.pl' and I've tried setting up
> same_hosts, but I can't find the right configuration. I'll explain: with
> static hosts I don't have any problem. I define them in same_hosts and it's
> all right, but I can't spider 'dynamic hosts' (sites that I don't add
> manually).

Most web sites have external links so there needs to be a way to limit the
spider's reach.
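
For context, that limit is normally expressed per server in the spider.pl
config file. A rough sketch, assuming the @servers / base_url / same_hosts
layout described in 'perldoc spider.pl'; the host names and email address
are placeholders, not anything from this thread:

```perl
# Sketch of a spider.pl config file (placeholder hosts and email).
@servers = (
    {
        base_url   => 'http://www.example.com/',
        # Hosts listed here are treated as the same site as base_url.
        # Links pointing at any other host are not followed.
        same_hosts => [ qw( example.com www2.example.com ) ],
        email      => 'admin@example.com',
    },
);

1;  # the file is loaded as Perl code, so end it with a true value
```

Every host has to be listed by hand here, which is why hosts discovered
while spidering are skipped.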

> Can I tell spider.pl to spider any link it finds without typing it
> manually?

Maybe not in the config file, but spider.pl is just a Perl script --
open it in an editor, search for things like "same_hosts", and you can
disable that check.


-- 
Bill Moseley moseley@hank.org
Received on Fri Mar 7 15:56:52 2003