Re: Can't connect to www.swish-e.org:80

From: <moseley(at)not-real.hank.org>
Date: Wed Aug 20 2003 - 13:31:50 GMT
On Wed, Aug 20, 2003 at 12:58:16AM -0700, Bucharow Leonard wrote:

> Now I'm trying to spider some internet server with -S prog and spider.pl and
> this fails.
> Of course I'm using a proxy server to connect to the internet.

Of course.

I don't use a proxy so I have not tested this.

You will have to enable proxies in the LWP::UserAgent object, and you 
can do this in the configuration.  For example, in your config say you 
have a test_url() callback function to ignore files that look like 
images (based on the file name).

In that same function you can enable the proxy.

    test_url => sub {
        my ($uri, $server) = @_;

        # enable proxy requests
        
        unless ($::proxy_set++) {
            my $ua = $server->{ua};
            $ua->proxy('http', 'http://proxy.myhost.com:8001');
        }

        # return true if not an image, otherwise false
        # (matches .jpg as well as .jpeg, in any letter case)
        return $uri->path !~ /\.(gif|jpe?g|png)$/i;
    },

You can also set LWP::UserAgent to read the proxy settings from the 
environment (the env_proxy method).  See perldoc LWP::UserAgent for 
details.
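
For the environment route, an untested sketch might look like the 
following.  It assumes the proxy URL is exported in the http_proxy 
environment variable before the spider is started; env_proxy() is the 
LWP::UserAgent method that loads the *_proxy variables.

    test_url => sub {
        my ($uri, $server) = @_;

        # load proxy settings from http_proxy, no_proxy, etc.
        # (assumes those variables are set in the spider's environment)
        unless ($::proxy_set++) {
            $server->{ua}->env_proxy;
        }

        # return true if not an image, otherwise false
        return $uri->path !~ /\.(gif|jpeg|png)$/;
    },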

That $::proxy_set is just a variable in the "main" package.  It's there 
so the proxy is only set once, on the first call to test_url().  The 
call to $ua->proxy() then sets the proxy for the http protocol.  It 
probably wouldn't run any differently if you dropped the variable and 
set the proxy on every call, i.e.:

    test_url => sub {
        my ($uri, $server) = @_;   

        # enable proxy requests
        my $ua = $server->{ua};
        $ua->proxy('http', 'http://proxy.myhost.com:8001');

        # skip images (.jpg as well as .jpeg, in any letter case)
        return 0 if $uri->path =~ /\.(gif|jpe?g|png)$/i;
        return 1;
    },
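
If the spider still can't connect, it may help to check the proxy 
settings outside of spider.pl first.  This is only a sketch, and again 
untested here: the proxy URL is the same placeholder as above, and the 
target is the host from the subject line.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;

    # placeholder proxy address; use your real proxy host and port
    $ua->proxy('http', 'http://proxy.myhost.com:8001');

    # fetch one page through the proxy and report the HTTP status
    my $response = $ua->get('http://www.swish-e.org/');
    print $response->status_line, "\n";

If that prints a success status, the same proxy() call should work in 
the test_url() callback.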



-- 
Bill Moseley
moseley@hank.org
Received on Wed Aug 20 13:35:29 2003