On Oct 27, 2010, at 9:19 AM, Peter Karman wrote:
> Troy Wical wrote on 10/27/2010 10:16 AM:
>
>> Ok, corrected some of the syntax issues in the spider config, but need to figure out how to properly activate the debug items.
>
> depending on your shell, you set the env vars differently. for
> bash-style it's:
>
> % SPIDER_QUIET=1 swish-e -c /home/search/t2.conf -S prog

Thanks, that got it. I also figured out the syntax for activating debugging. Perhaps a clearer example could be added to the spider.pl documentation page, for users at my level :) In the end, this is what I have now.
################################################
[root@purple /home/search]# more t2.spider.config
@servers = (
{
base_url => 'http://type2.com/ezmlm-archives/index.cgi?list=type2',
use_default_config => 1,
email => 'troy@wical.com',
delay_sec => 0,
max_depth => 2,
keep_alive => 1,
debug => DEBUG_URL | DEBUG_FAILED | DEBUG_SKIPPED,
},
);
################################################
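For anyone else at my level, here is a rough sketch of the two ways to hand an environment variable to a single command, as discussed above. The `sh -c` child is only a hypothetical stand-in for the real `swish-e -c /home/search/t2.conf -S prog` call, so the snippet can be run anywhere. SPIDER_QUIET comes from the thread; SPIDER_DEBUG and its comma-separated value are an assumption on my part and should be checked against the spider.pl documentation.

```shell
#!/bin/sh
# Start from a clean slate so the demo output is predictable.
unset SPIDER_QUIET SPIDER_DEBUG

# Hypothetical stand-in for "swish-e -c /home/search/t2.conf -S prog":
# a child process that just prints the variables it inherited.
child='echo "quiet=${SPIDER_QUIET:-unset} debug=${SPIDER_DEBUG:-unset}"'

# bash/sh style: the assignment prefix applies to this one command only.
one_shot=$(SPIDER_QUIET=1 sh -c "$child")
echo "$one_shot"

# Portable style (also usable from csh/tcsh): let env(1) set the variables.
# SPIDER_DEBUG=url,failed,skipped is an assumed equivalent of the
# "debug => DEBUG_URL | DEBUG_FAILED | DEBUG_SKIPPED" config line.
via_env=$(env SPIDER_QUIET=1 SPIDER_DEBUG=url,failed,skipped sh -c "$child")
echo "$via_env"

# The parent shell is untouched by either form.
echo "parent: ${SPIDER_QUIET:-unset}"
```

Either form keeps the setting scoped to one run, so nothing needs to be unset afterwards.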
Below are some of the results.
################################################
http://type2.com/ezmlm-archives/index.cgi?list=type2&cmd=monthbythread&month=201009:152: error: htmlParseEntityRef: expecting ';'
<a class="menu-months" href="?list=type2&cmd=months">[Months]</a>
http://type2.com/ezmlm-archives/index.cgi?list=type2&cmd=monthbythread&month=201009:157: error: Unexpected end tag : div
</div>
^
Warning: Unknown header line: 'ath-Name: http://type2.com/ezmlm-archives/index.cgi?list=type2&cmd=monthbydate&month=201009' from program spider.pl
err: External program failed to return required headers Path-Name:
.
[root@purple /home/search]# >> +Fetched 1 Cnt: 11 GET http://type2.com/ezmlm-archives/index.cgi?list=type2&cmd=monthbythread&month=201008 200 OK text/html ??? parent:http://type2.com/ezmlm-archives/index.cgi?list=type2 depth:1
################################################
Troy
_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users
Received on Wed Oct 27 11:33:57 2010