crawling protected site

From: intervolved none <intervolved(at)>
Date: Tue May 10 2005 - 21:36:10 GMT
I need to crawl a website that is protected by Windows authentication, but when swish-e tries to crawl it, the request returns a 401 error. I pass in the username and password the same way I have tried in IE ( ), and it does not work with swish-e. Below are a condensed config file and the output generated when I run the command to index the site. Thanks in advance.
c:> type mytestsite.config   (subset of config file)

MaxDepth 0
Delay 0
IndexContents HTML2 .htm .html .shtml
IndexContents TXT .pdf 
IndexFile newprimarycare.index
StoreDescription HTML2 <body> 200
StoreDescription TXT 200
DefaultContents HTML2 

c:> swish-e.exe -v 3 -S http -c "mytestsite.config"
Now fetching ;"... Status: 401.
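
One way to narrow down the 401, as a rough sketch rather than anything swish-e provides: a 401 response names the authentication schemes the server accepts in its WWW-Authenticate headers. If only Negotiate or NTLM are listed, a username and password sent as HTTP Basic credentials would be refused. The function names below and the URL passed to `probe` are hypothetical, and the parser assumes one challenge per header line.

```python
import urllib.error
import urllib.request


def offered_schemes(header_values):
    """Return the auth scheme names from WWW-Authenticate header lines.

    Simplification (an assumption of this sketch): one challenge per
    header line, so the scheme is the first whitespace-delimited token.
    """
    schemes = []
    for value in header_values:
        parts = value.split(None, 1)
        if parts:
            schemes.append(parts[0].lower())
    return schemes


def probe(url):
    """Request url without credentials; on a 401, list the offered schemes."""
    try:
        urllib.request.urlopen(url)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            values = err.headers.get_all("WWW-Authenticate") or []
            return offered_schemes(values)
        raise
    return []  # the server did not demand authentication
```

Calling `probe` with the protected site's address (for example a placeholder such as `http://intranet.example/`) would show whether Basic is among the schemes on offer.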



Due to deletion of content types excluded from this list by policy,
this multipart message was reduced to a single part, and from there
to a plain text message.
Received on Tue May 10 14:36:22 2005