Re: [swish-e] How do I index via HTTP when authentication is

From: Adam Douglas <ADouglas(at)>
Date: Wed Feb 06 2008 - 22:26:39 GMT
Hi Bill. Thanks for your reply. See below for comments/questions.

> I'm not sure why it's any more dangerous to require/allow the 
> swish-e spider to login to an application than any other user 
> agent that presents credentials.  In fact for a public facing 
> application, far more checks can be applied 
> (username/password; IP_address; one-of-a-kind user agent) 
> to the spider than is feasible with a normal user's login.
> Merely enabling cookies by itself presents just as much risk 
> of forgery.

This is true. I just don't like to open up GET requests when it's not
necessary. All logins are done via a POST to the /login/ URL.
Logouts are /logout/ and nothing more in the URL. Maybe I'll give this a
try then and see what I can make of it.

> Anyway, here's a snip from my @servers:
> @servers = (
>          {
>          base_url    => ' 
> _function=checkpw&userid=swishe&password=swishe&remember=no',
>          use_cookies => 1,
> #        debug => DEBUG_URL | DEBUG_SKIPPED | DEBUG_FAILED |  
>          delay_sec => 1,
>          test_url    => sub {
>                  my  $ok =  !($_[0]->path =~ / && 
> $_[0]->query =~ /_function=logout/);
>                  return 1 if $ok;
>                  return; },
> ...
> Essentially, the spider logs in as the user 'swishe' so it 
> sees the same content as any similarly privileged user. 
> remember=no means don't give swish-e a long-term cookie to 
> re-authenticate with.
> use_cookies allows the application to provide, and swish-e to 
> use, the session cookies needed for access.
> test_url keeps the spider from following a link to log out, to 
> ensure we follow all links.

Ahhh okay, this makes more sense now. Oh, mmm, I wonder if I can still
apply that test_url when all I have is a path. I guess I'll find out.
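
For reference, here is a rough sketch of what a path-only test_url might
look like. It assumes the logout link is exactly /logout/ with no query
string, as described above, and relies on spider.pl passing a URI object
as the first argument to the callback. The example.com URLs and the
$test_url variable name are placeholders for illustration only and are
not part of anyone's real config.

```perl
use strict;
use warnings;
use URI;    # spider.pl hands test_url a URI object as its first argument

# Hypothetical path-only variant of the test_url callback quoted above.
# Assumption: the logout link is /logout/ (trailing slash optional) and
# carries no query string, so only the path needs to be checked.
my $test_url = sub {
    my ($uri) = @_;
    return 0 if $uri->path =~ m{^/logout/?$};    # do not follow the logout link
    return 1;                                    # follow every other link
};

# Quick check outside the spider, using placeholder URLs.
for my $url ( 'http://example.com/logout/', 'http://example.com/docs/page.html' ) {
    my $verdict = $test_url->( URI->new($url) ) ? 'follow' : 'skip';
    print "$url => $verdict\n";
}
```

In an @servers entry the same anonymous sub would be given as the value
of test_url, in place of the path-and-query check in Bill's snippet.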

Thanks Bill!

Received on Wed Feb 6 17:26:40 2008