Re: random crashing of spider.pl!?

From: Justin Tang <justin.tang(at)not-real.positionresearch.com>
Date: Mon Jan 17 2005 - 19:06:20 GMT
I'm not following what you are saying about forking the spider out as
a zombie.  Are you saying you are running the spider in the
background?

--Basically, I wrote a function that dynamically creates the config file and
then uses the `` operator to invoke spider.pl with the newly created config
file. The function dies after the invocation while spider.pl continues to
run. This is where the problem occurs: the spider.pl process dies
spontaneously, without any warning. So I tried running the whole thing from
the command line, and as far as I can tell, when spider.pl is unable to
connect to a site, it defaults to another page and hangs while it asks me
for a username and password. If I put spider.pl in the background (running
it from the command line with &), the process is put to sleep while it
waits for the username. But if I run it with a terminal, it sometimes
crashes with an "Alarm Clock" message (I'm not too sure what that means).
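
A background job that tries to read from the terminal is normally stopped by the shell's job control (SIGTTIN), which matches the "put to sleep" behaviour described above. One way to sidestep that, and to keep a record of why the process died, is to start the child with its input taken from /dev/null and its output appended to a log file. The following is only a minimal sketch: launch_in_background is a hypothetical helper, and the child here is a throwaway Perl one-liner standing in for spider.pl and its config file.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempfile);

# Hypothetical helper: start @cmd in the background and return its PID.
# Everything the child prints (STDOUT and STDERR) is appended to $log.
sub launch_in_background {
    my ($log, @cmd) = @_;
    defined(my $pid = fork()) or die "fork failed: $!";
    return $pid if $pid;    # parent: hand back the child's PID

    # Child: reads see end-of-file instead of waiting on a terminal,
    # and all output goes to the log file.
    open STDIN,  '<',  '/dev/null' or POSIX::_exit(1);
    open STDOUT, '>>', $log        or POSIX::_exit(1);
    open STDERR, '>&', \*STDOUT    or POSIX::_exit(1);
    exec { $cmd[0] } @cmd or POSIX::_exit(127);
}

# Demo with a stand-in child; replace with the real command line.
my (undef, $log) = tempfile();
my $pid = launch_in_background($log, $^X, '-e',
    'print "child ran\n"; warn "a warning\n"');
waitpid($pid, 0);
my $exit_code = $? >> 8;
my $signal    = $? & 127;   # non-zero if the child was killed by a signal
print "child $pid: exit code $exit_code, signal $signal, log in $log\n";
```

Checking $? & 127 after waitpid shows whether the child was killed by a signal (14 is SIGALRM on most systems), which is easier to act on than a process that silently disappears.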

I assume you are NOT running on Windows.  On Windows there's no
alarm() function (IIRC), so it would just sit there and wait, I
suppose.

If the spider is running in the background then I'd think that any
password request would just time out and continue.  But maybe there's
an issue if there's no controlling terminal.  I have not checked that.
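
For reference, "Alarm clock" is what the shell prints when a process is killed by SIGALRM with no handler installed. The usual Perl idiom for a timeout that continues instead of dying wraps the slow operation in eval with a local $SIG{ALRM} handler. The sketch below only illustrates that idiom and is not taken from spider.pl; with_timeout is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns the code's result, or undef if it timed out.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        # Without a handler, SIGALRM's default action terminates the
        # process and the shell reports "Alarm clock".
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;    # finished in time: cancel the pending alarm
        1;
    };
    alarm 0;        # make sure no alarm is left pending
    return $result if $ok;
    die $@ unless $@ eq "timeout\n";    # re-throw unrelated errors
    return;
}

my $fast = with_timeout(2, sub { "answered" });
my $slow = with_timeout(1, sub { sleep 5; "too late" });
print "fast: ", (defined $fast ? $fast : "timed out"), "\n";
print "slow: ", (defined $slow ? $slow : "timed out"), "\n";
```

If an alarm fires at a point where no handler is in place, the process dies outright, which would look like the crash described earlier in the thread.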

Maybe you should add a few debugging lines in the
get_basic_credentials() subroutine and see if it's stopping there, and
where.

> I've been stuck on this for so long... If anyone can help me out of it, I
> would be so grateful...

General ideas:

How are you starting the spider?

Can you make it happen if you specify just one file in the spider config
(the protected file)?
--I am only specifying one file

Set a SIGUSR1 handler to do a backtrace if the spider is hanging.
--Is there any good documentation on how to set a SIGUSR1 handler?

Add more debugging statements to see where things fail.
--It seems to fail between test_url and test_response
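
On the signal-handler idea above: Perl installs handlers through the %SIG hash (see perlipc and perlvar), and the Carp module can produce the stack trace. A minimal sketch, assuming the handler is added near the top of the script being debugged; fetch_page and wait_for_response are hypothetical stand-ins, and the script signals itself only to keep the demo self-contained.

```perl
use strict;
use warnings;
use Carp ();

our $last_trace = '';

# On SIGUSR1, record and print a stack trace, then carry on running.
$SIG{USR1} = sub {
    $last_trace = Carp::longmess("Caught SIGUSR1");
    print STDERR $last_trace;
};

# Hypothetical stand-ins for whatever the real script is doing.
sub wait_for_response { kill 'USR1', $$; return "done" }
sub fetch_page        { return wait_for_response() }

fetch_page();
```

With a real hung process you would send the signal from another shell with kill -USR1 followed by the process ID, and the trace shows which subroutine it is sitting in.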


Thanks Bill for all your replies.

-Justin


--
Bill Moseley
moseley@hank.org

Received on Mon Jan 17 11:06:22 2005