Hi Bill,
Thanks for the reply.
> Date: Fri, 5 Sep 2008 08:14:16 -0700
> From: Bill Moseley <moseley@hank.org>
> Subject: Re: [swish-e] Trac & wiki authorization
> To: Swish-e Users Discussion List <users@lists.swish-e.org>
>
> On Fri, Sep 05, 2008 at 03:59:41PM +0800, Tian Xinchun wrote:
> > Dear experts,
> >
> > I am trying to index password-protected sites such as Trac and MediaWiki.
> > I have succeeded with public pages and with pages behind basic
> > authentication, using "credentials => 'username:password'". This solution
> > does not seem to work for Trac and MediaWiki. Any help?
>
> The credentials are for basic auth. You would need to alter your
> script to log in by posting to the form.
>
> I'm not sure of the details, but what you likely need is to alter the
> spider to make a request (a POST) to the login form before you start
> to spider using the same user agent (and thus the same cookie jar)
> that the spider uses. Then that will save the cookie and you should
> be able to spider.
I will give it a try, thanks.
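If I understand the suggestion correctly, it would look roughly like the
sketch below. It uses LWP::UserAgent with an HTTP::Cookies jar. The agent
string, the form field names (wpName, wpPassword) and the command-line
arguments are only my assumptions; the real field names have to be read from
the HTML of the login form. I have not yet worked out how to make spider.pl
use this same agent.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Build a user agent with an in-memory cookie jar, so that a session
# cookie set by the login response is sent again on later requests.
sub new_agent {
    return LWP::UserAgent->new(
        agent      => 'swish-e-login-sketch/0.1',   # hypothetical agent string
        cookie_jar => HTTP::Cookies->new,
    );
}

# POST the login form. $form is a hash ref of field name => value.
# Returns true when the server answers 2xx or 3xx. Many login forms answer
# with a redirect, which LWP does not follow for POST by default.
sub login {
    my ($ua, $login_url, $form) = @_;
    my $res = $ua->post($login_url, $form);
    my $ok  = $res->is_success || $res->is_redirect;
    warn "Login request failed: ", $res->status_line, "\n" unless $ok;
    return $ok;
}

# Runs only when called as a script with three arguments:
# login URL, user name, password.
if (!caller && @ARGV == 3) {
    my ($login_url, $user, $pass) = @ARGV;
    my $ua = new_agent();
    # Field names are assumptions; check the login form's HTML.
    login($ua, $login_url, { wpName => $user, wpPassword => $pass })
        or die "Could not log in\n";
    print $ua->cookie_jar->as_string;   # show the cookies that were saved
}
```

Saved under a hypothetical name such as login.pl, it would be run as
"perl login.pl LOGIN_URL USER PASSWORD". Later requests made with the same
$ua object would then carry the saved cookie.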
>
> I'd also look at just indexing the data directly from the database
> instead of spidering, if you can figure out the URL mapping.
>
I am sorry, but I do not fully understand what you mean. Could you give me
more information or point me to some links? Thanks.
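My only guess so far is that it means a small program that reads the pages
from the database and prints them in the format that "swish-e -S prog"
reads: a few headers, a blank line, then the document. Below is a sketch of
that guess. The page list stands in for rows from the database, and the URL
pattern, host name and field names are all made up by me.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical records standing in for rows read from the wiki database.
my @pages = (
    {
        title => 'Main_Page',
        mtime => 1220857200,
        text  => '<html><body><p>Hello</p></body></html>',
    },
);

# Hypothetical mapping from a page title to the URL users would visit.
sub page_url {
    my ($title) = @_;
    return "https://wiki.example.org/index.php?title=$title";
}

# Format one document for "swish-e -S prog": headers, a blank line,
# then the content. Content-Length must be the length in bytes.
sub format_doc {
    my ($url, $mtime, $html) = @_;
    my $bytes = encode('UTF-8', $html);
    return "Path-Name: $url\n"
         . "Content-Length: " . length($bytes) . "\n"
         . "Last-Mtime: $mtime\n"
         . "Document-Type: HTML*\n"
         . "\n"
         . $bytes;
}

binmode STDOUT;   # the content is already encoded, so write raw bytes
print format_doc(page_url($_->{title}), $_->{mtime}, $_->{text}) for @pages;
```

If this is the right idea, the program (saved under a hypothetical name such
as wiki2swish.pl) would be given to swish-e with "-S prog -i ./wiki2swish.pl"
instead of spider.pl. Is that what you meant?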
Best Regards,
Xinchun
> >
> > The following is the part of my spider.conf that has the problem.
> > my %privatewiki = (
> > email => 'tianxc@ihep.ac.cn',
> > base_url => 'https://wiki.bnl.gov/dayabay-private/index.php?title=Main_Page',
> > delay_sec => '1',
> > max_depth => '2',
> > credentials => 'username:password'
> > );
> >
> > my %repository = (
> > email => 'tianxc@ihep.ac.cn',
> > base_url => 'http://dayabay.ihep.ac.cn/tracs/dybsvn/browser/',
> > delay_sec => '1',
> > max_depth => '5',
> > credentials => 'username:password'
> > );
> >
> > Thanks
> > Xinchun Tian
> >
> >
> >
> > _______________________________________________
> > Users mailing list
> > Users@lists.swish-e.org
> > http://lists.swish-e.org/listinfo/users
> >
>
> --
> Bill Moseley
> moseley@hank.org
Received on Mon Sep 8 11:54:12 2008