
Re: Filters/HTTP (was:Documentation structure)

From: <Rainer.Scherg(at)not-real.rexroth.de>
Date: Thu Dec 14 2000 - 15:23:04 GMT
> -----Original Message-----
> From: Bill Moseley [mailto:moseley@hank.org]



> >Passing the real path/URL is just for information or special
> >purpose, mostly not used by the filter program.
> 
> So does that mean you can't currently use filters in the 
> httpd access method?

Of course it should work (though I have personally only tested the
file indexing mechanism...). What I meant was that the temp file
created locally by the spider process (it has to store the document
retrieved from the web somewhere) is passed to the filter, with the
original address (URL) as a second parameter, in case you need to
know the original URL.

e.g.:    filterprog   /tmp/swishspider.12345.docname  http://server/path.htm
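
To make the calling convention above concrete, here is a minimal sketch
of what such a filter program could look like. It is not taken from
swish itself: the function name filter_document, the naive tag
stripping, and the assumption that the indexer reads the filtered text
from stdout are all mine, for illustration only.

```python
#!/usr/bin/env python3
"""Hypothetical filter: strips markup from the temp file, prints plain text."""
import re
import sys


def filter_document(work_path, original_url=None):
    """Return the text of the document at work_path with tags removed.

    work_path    -- local temp file written by the spider (first argument)
    original_url -- original address of the document (second argument);
                    informational only, unused by this filter
    """
    with open(work_path, encoding="utf-8", errors="replace") as handle:
        raw = handle.read()
    # Naive tag stripping, for illustration only.
    text = re.sub(r"<[^>]*>", " ", raw)
    # Collapse the whitespace left behind by the removed tags.
    return " ".join(text.split())


if __name__ == "__main__" and len(sys.argv) >= 2:
    # Invoked as: filterprog WORKFILE [URL]
    url = sys.argv[2] if len(sys.argv) > 2 else None
    # Assumption: the indexer picks up the filtered text from stdout.
    sys.stdout.write(filter_document(sys.argv[1], url) + "\n")
```

The point is only that the filter works on the local temp file and can
ignore the URL unless it has a special need for it.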



> Exactly.  That's exactly why I think httpd should be moved 
> out of swish --
> or at least provide an interface to an external document 
> source provider.


Hmm, IMO these are two separate methods.

IMO httpd should be included as part of swish and should be internal.
We should also provide an external feed (like the Perl spider swish
is using now).

If the internal method is not sufficient, the admins could switch to an
external process. Reusing parts of wget for the internal spider would be
nice...

cu - rainer


Received on Thu Dec 14 15:25:48 2000