Re: Callback Functions For Indexing

From: Peter Karman <peter(at)>
Date: Wed Feb 08 2006 - 18:54:01 GMT
The problem is that the spider doesn't know anything about whether a file was 
indexed or not. The spider just pulls down files via HTTP, handles them 
as you've configured, then prints the content out, either to swish-e or a file 
or some other 'thing' you've designated.

If a file fails to be indexed, the swish-e command will issue an error/warning, 
depending on how you've configured it (see ParserWarnLevel and Verbose settings 
in swish configuration docs). So you could parse the output of swish-e to see 
what succeeds and what doesn't.
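
As a rough sketch of that idea (not something that ships with swish-e), a small Python wrapper could run the indexing command, capture what it prints, and store the outcome in the two fields asked about below. The function name run_and_record, the table name indexing_log, and the default database path are all made up for illustration. It also assumes that a non-zero exit status means the run failed; warnings printed during an otherwise successful run would need extra parsing that depends on your ParserWarnLevel and Verbose settings.

```python
import sqlite3
import subprocess
import sys


def run_and_record(cmd, db_path="indexing_log.db"):
    """Run an indexing command and store its outcome in SQLite.

    cmd is the full command as a list of strings (hypothetical helper).
    Returns the (indexing_result, indexing_error) pair that was stored.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)

    # Assumption: a non-zero exit status means the run failed.
    ok = proc.returncode == 0
    # Keep the error text only on failure; prefer stderr, fall back to stdout.
    error_text = None if ok else (proc.stderr.strip() or proc.stdout.strip())

    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS indexing_log ("
            "indexing_result INTEGER NOT NULL, indexing_error TEXT)"
        )
        conn.execute(
            "INSERT INTO indexing_log (indexing_result, indexing_error) "
            "VALUES (?, ?)",
            (int(ok), error_text),
        )
        conn.commit()
    finally:
        conn.close()
    return ok, error_text
```

You would call it with whatever command line you already use for indexing, passed as a list of arguments. Each run adds one row: 1 and NULL on success, 0 and the captured error text on failure.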

If what you're really asking is how to check if the spider successfully retrieved 
and handled each URL, then the test_response callback is likely what you want. 
There are some examples in the docs.

Hope that helps.

andy rosbrook scribbled on 2/5/06 10:41 AM:

> Ok, I've been experimenting with the callback functions. I'm a bit confused 
> as to what I can do through them.
> I'd just like to know if there is any way I could write to a database with 
> two fields:
> indexing_result | indexing_error
> If the indexing is successful I can just write TRUE to the database; if 
> the indexing is not successful then I would like to write FALSE to the 
> database along with the error that caused indexing to fail.
> I've been trying to pipe out STDERR and have been trying to create an 
> indexing API. I'm just wondering if the problem above can be solved through 
> callback functions. Is it possible to write out the DEBUG info to a database 
> through a callback function?
> Sorry for all the questions!
> andy
>>From: Bill Moseley <>
>>To: Multiple recipients of list <>
>>Subject: [SWISH-E] Re: Callback Functions For Indexing
>>Date: Fri, 27 Jan 2006 10:23:41 -0800 (PST)
>>On Fri, Jan 27, 2006 at 09:44:44AM -0800, andy rosbrook wrote:
>>>Well, I just want to know after each URL in spider.config whether the
>>>spidering was a success or a failure. I know I could just check for a
>>>complete index.swish-e but this doesn't allow me to capture any error
>>Use the test_response callback in your config.  That's called right after
>>the HEAD or GET request has returned (and sometimes before all the
>>data has been fetched from the remote server).
>>Bill Moseley

Peter Karman  .  .  peter(at)
Received on Wed Feb 8 10:54:04 2006