Re: enhancement request

From: Bernhard Weisshuhn <bkw(at)>
Date: Thu Sep 16 2004 - 10:00:10 GMT
On Wed, Sep 15, 2004 at 03:49:37PM -0700, SRE <> wrote:

> >On Wed, Sep 15, 2004 at 01:37:50PM -0700, Richard Morin wrote:
> >> If I had the URL, I'd know where to start looking...
> At 01:57 PM 9/15/2004, Bill Moseley wrote:
> >Well that's easy: it's http://<yourserver>/robots.txt
> It's not my discussion, but let me make an educated guess...
> Richard is saying he thinks SWISH should print the URL of
> the file it was attempting to spider when the problem occurred.
> I tend to agree with him. 
> That's why he said this:
> At 01:43 PM 9/15/2004, Richard Morin wrote:
> >The spidering script certainly knows where it's looking, at
> >any given time.  Does the module not return a status code?  Sigh.
> In the commercial software I've written I found it useful to have 
> every nested level print something and return an error code as a 
> low-level error ripples up to the calling tool aborting. [...]

Yes yes, we are all in favour of proper exception handling and
debugging information, unit testing and whatnot. The point is this:
IT IS THE FARKING LIBRARY that is retrieving robots.txt, it is an
*external* product. Bill gave the reference where the mechanism is
described. So either talk to the authors of the LWP modules, wait
until someone reinvents the wheel *with* proper exception handling and
reporting, or - tadaa - do it yourself. 
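
For what it's worth, the do-it-yourself route can be as small as a wrapper
around whatever does the fetching, so that any failure carries the URL with
it. This is only a rough sketch under stated assumptions: it is in Python
rather than Perl, it is not the LWP API, and FetchError, fetch_with_context
and robots_url are made-up names for illustration. The fetch callable stands
in for whichever library call actually retrieves the page.

```python
class FetchError(Exception):
    """Error that records which URL was being fetched when things failed."""

    def __init__(self, url, cause):
        super().__init__(f"failed while fetching {url}: {cause!r}")
        self.url = url      # the URL the caller asked for
        self.cause = cause  # the original low-level exception


def fetch_with_context(fetch, url):
    """Call fetch(url); on any failure, re-raise with the URL attached.

    `fetch` is a hypothetical stand-in for the external library call.
    """
    try:
        return fetch(url)
    except Exception as exc:
        # Keep the original exception chained so no detail is lost.
        raise FetchError(url, exc) from exc


def robots_url(base):
    """Build the robots.txt URL for a server root, e.g. http://<yourserver>."""
    return base.rstrip("/") + "/robots.txt"
```

A caller would then wrap each request, robots.txt included, and print the
url attribute of the FetchError when something goes wrong.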

That's all.
Received on Thu Sep 16 03:00:37 2004