> >Passing the real path/URL is just for information or special 
> purpose, mostly not used by the filter program.
> So does that mean you can't currently use filters in the 
> httpd access method?

Of course it should work (though I have personally only tested the
file indexing method). What I meant was that the temp file created
locally by the spider process (it has to store the document retrieved
from the web somewhere) is passed to the filter as the first parameter,
and the original address (URL) is passed as the second, in case the
filter needs to know the original URL.

e.g.:    filterprog   /tmp/swishspider.12345.docname  http://server/path.htm
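
To illustrate the calling convention above, here is a minimal sketch of
what such a filter could look like. This is not code from swish; the
function name run_filter is made up, and I am assuming the filter
returns its result on stdout. The only part taken from the description
above is the argument order: temp file first, original address second.

```python
import sys


def run_filter(argv, out=sys.stdout):
    """Hypothetical filter: argv[1] is the local temp file written by the
    spider, argv[2] (optional) is the original address (URL or real path)."""
    if len(argv) < 2:
        raise SystemExit("usage: filterprog WORKFILE [ORIGINAL_ADDRESS]")

    work_path = argv[1]
    # The original address is informational only; most filters ignore it.
    original = argv[2] if len(argv) > 2 else None

    # Always read the local temp file, never the original address.
    with open(work_path, "r", errors="replace") as fh:
        text = fh.read()

    # Placeholder for the real conversion step (e.g. PDF to text).
    # Assumption: the filtered document is written to stdout.
    out.write(text)
    return original


# Only act as a command-line program when arguments were actually given.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_filter(sys.argv)
```

Saved as filterprog, it would be invoked exactly like the example line
above. A filter that does not care about the URL simply never looks at
the second argument.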

> Exactly.  That's exactly why I think httpd should be moved 
> out of swish --
> or at least provide an interface to an external document 
> source provider.

Hmm, these are IMO two separate methods.

IMO httpd should be included as part of swish and should be internal.
We should also provide an external feed (the way swish currently uses the Perl spider).

If the internal method is not sufficient, the admins could switch to an
external process. Parts of wget for the internal spider would be

cu - rainer

