Re: swish-e only spiders the server it started on

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue May 16 2006 - 14:20:27 GMT
On Tue, May 16, 2006 at 09:49:28AM +0200, Cas Tuyn wrote:
> >  base_url => [qw! http://aaa.company.com/intranet/index.html
> >      http://bbb.company.com/ http://ccc.company.com/ !],
> >
> >And see what happens tonight.
> 
> but ran into authorization problems on the 2nd and 3rd server,
> although all three servers are single sign-on. This is what the IT
> admin replied:
> 
> 2 Things seem to be of importance here:
> - We have 3 different servers with different content in the same path
> (e.g. /index.html on aaa.company.com, bbb.company.com, ccc.company.com)
> - The authentication used is equal on all 3 systems but seems to be used
> only for the 1st URL in the list. The 2nd and 3rd host return "401
> Unauthorized"

This is all the spider does:

        # base_url may be a single URL or an array ref; normalize to a list
        my @urls = ref $s->{base_url} eq 'ARRAY' ? @{$s->{base_url}} : ( $s->{base_url} );
        for my $url ( @urls ) {

            # purge config options -- used when base_url is an array
            $valid_config_options{$_} || delete $s->{$_} for keys %$s;

            # the same config hash is reused; only base_url changes per pass
            $s->{base_url} = $url;
            process_server( $s );
        }

So it's the same config for each one.  Maybe the auth is reset
somehow during the run.

> I just reread the whole documentation, but could not find anything
> about authentication on multiple servers. Who has a similar setup and a
> solution?

Seems like it would not be very hard for you to debug.  Set up a few
test servers with auth (just create three domains in your hosts file
all pointing to the same local web server) and have the spider print
out the request and response headers -- or even just throw in a few
print statements into the script to print out when auth is being set.
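
As a rough starting point, something like the following standalone
script could show whether the Authorization header goes out to each
host and what comes back.  This is only a sketch and not part of
spider.pl: auth_trace() is a made-up helper, TEST_USER and TEST_PASS
are hypothetical environment variables, and it assumes the servers use
HTTP basic auth.  Pass the three test URLs on the command line.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Summarize the auth-related headers of one request/response pair.
# (Hypothetical helper for debugging; not part of spider.pl.)
sub auth_trace {
    my ( $request, $response ) = @_;
    my $challenge = $response->header('WWW-Authenticate');
    return sprintf "%s %s\n  Authorization: %s\n  Status: %s\n  WWW-Authenticate: %s\n",
        $request->method,
        $request->uri,
        ( defined $request->header('Authorization') ? 'sent' : 'not sent' ),
        $response->status_line,
        ( defined $challenge ? $challenge : '(none)' );
}

# Fetch each URL from the command line with the same credentials.
# simple_request() sends exactly one request: no redirects, no auth retry.
my $ua = LWP::UserAgent->new( timeout => 10 );
for my $url (@ARGV) {
    my $request = HTTP::Request->new( GET => $url );

    # TEST_USER / TEST_PASS are assumed names; basic auth is an assumption too
    $request->authorization_basic( $ENV{TEST_USER}, $ENV{TEST_PASS} )
        if defined $ENV{TEST_USER} && defined $ENV{TEST_PASS};

    my $response = $ua->simple_request($request);
    print auth_trace( $request, $response );
}
```

If all three hosts show "Authorization: sent" and still return 401,
the problem is on the server side; if the header only goes out for the
first host, the same kind of print statements inside the spider should
show where it gets dropped.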


-- 
Bill Moseley
moseley@hank.org

Received on Tue May 16 07:20:38 2006