On Thu, Jan 04, 2007 at 10:07:22AM -0800, James wrote:
> I am also wondering if there is a way to get Swish-e's spider to
> automatically follow links to subdomains of the same domain, without having
> it follow off-site links to other domains. Do you know what I mean?
The spider is just Perl, so it's easy to change. Look for this check in the spider source:
# Here we make sure we are looking at a link pointing to the correct (or equivalent) host
unless ( $server->{scheme} eq $u->scheme && $server->{same_host_lookup}{$u->canonical->authority||''} ) {
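How about something like this, which accepts mydomain.com itself and any host ending in .mydomain.com (but not, say, notmydomain.com):
unless ( $server->{scheme} eq $u->scheme && $u->host =~ /(?:^|\.)mydomain\.com$/i ) {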
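
To see which hosts a pattern like this accepts before editing the spider, a small standalone script can help. This is only a sketch, assuming the URI module is installed; the helper name in_domain and the example URLs are made up for illustration and are not part of the spider.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not part of the spider): true if the URL's host
# is mydomain.com or any subdomain of it.
sub in_domain {
    my ($url) = @_;
    my $u = URI->new($url);

    # Some URI classes (e.g. mailto:) have no host method at all.
    return 0 unless $u->can('host');

    my $host = $u->host;
    return 0 unless defined $host;    # relative URLs have no host

    # (?:^|\.) requires the match to start at the beginning of the host
    # or right after a dot, so notmydomain.com is rejected.
    return $host =~ /(?:^|\.)mydomain\.com$/i ? 1 : 0;
}

for my $url (qw(
    http://mydomain.com/
    http://www.mydomain.com/docs/
    http://notmydomain.com/
    http://example.org/
)) {
    printf "%-32s %s\n", $url, in_domain($url) ? 'follow' : 'skip';
}
```
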
You might want to print out what $u->host returns.
Run "perldoc URI" and take a look at things like:
$uri->authority
$uri->host
$uri->host_port
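
A quick way to compare those three is to print them for a sample URL. A minimal sketch, assuming the URI module is installed; the URL is just an example:

```perl
use strict;
use warnings;
use URI;

# Example URL only; substitute one that the spider actually encounters.
my $u = URI->new('http://www.mydomain.com:8080/docs/index.html');

print 'authority: ', $u->authority, "\n";   # host plus any userinfo and explicit port
print 'host:      ', $u->host,      "\n";   # host name only
print 'host_port: ', $u->host_port, "\n";   # host:port, using the scheme's default port if none is given
```

Since host leaves out the port, it is probably the easiest of the three to match a domain pattern against.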
--
Bill Moseley
moseley@hank.org
Received on Thu Jan 4 10:31:36 2007