Are you using a custom SWISH::Filter?

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Oct 05 2004 - 18:59:40 GMT
I updated some of the code in SWISH::Filter to work a bit differently.
This affects how individual filter modules interface with
SWISH::Filter.

So, if you have any custom filters created for use with SWISH::Filter
you will need to update your filters for the next version of Swish.

Also, if you do have a custom filter then perhaps it could be included in
the distribution.

I've also been updating spider.pl -- not as much as I'd like -- but I
added a few features.

For example, you can now test a content-type with SWISH::Filter to see
if it *would* try to filter it.  So, spider.pl now does HEAD requests
to fetch the content type before actually fetching the document.

Before, spider.pl would always use a GET request.  The GET request
could be aborted before downloading all the content, but that breaks
the existing keep-alive connection.
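
As a rough illustration of the HEAD-then-GET flow described above, here
is a minimal sketch in Python.  It is not spider.pl's actual Perl code:
the names FILTERABLE, INDEXABLE, can_filter and fetch_if_wanted are
hypothetical, and the content-type sets are only assumptions made for
the example.  The head and get arguments are caller-supplied functions,
so the logic can be tried without a network connection.

```python
from typing import Callable, Optional

# Assumed for illustration: types a filter could convert, and types
# that can be indexed without any filtering.
FILTERABLE = {"application/pdf", "application/msword"}
INDEXABLE = {"text/html", "text/plain"}


def normalize(content_type_header: str) -> str:
    """Drop parameters such as '; charset=utf-8' and lowercase the type."""
    return content_type_header.split(";", 1)[0].strip().lower()


def can_filter(content_type: str) -> bool:
    """Hypothetical stand-in for asking the filter layer about a type."""
    return normalize(content_type) in FILTERABLE


def fetch_if_wanted(
    url: str,
    head: Callable[[str], str],   # returns the Content-Type header value
    get: Callable[[str], bytes],  # returns the document body
) -> Optional[bytes]:
    """Issue a HEAD first; only GET the body if the type is usable."""
    content_type = normalize(head(url))
    if content_type not in INDEXABLE and not can_filter(content_type):
        # Skipped without ever starting a GET, so there is no partial
        # download to abort on the connection.
        return None
    return get(url)
```

The point of the design is that an unwanted document costs one small
HEAD request instead of an aborted GET, so the keep-alive connection
can be reused for the next URL.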

Another feature that should help new users is the ability to modify
the default configuration of the spider instead of just being able to
override it.

Before:

    spider.pl default <URL>

would automatically use SWISH::Filter for converting PDFs and MS Word
docs.  But if you use your own config:

    spider.pl spider.config

then you had to arrange to use SWISH::Filter in spider.config if you
wanted to filter.  A bit confusing, so you can now merge spider.config
with the "default" config if you only want to change a few things from
the default config.
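
The difference between overriding and merging can be sketched roughly
as follows.  This is Python rather than the spider's Perl, and the
setting names (use_filter, max_depth, delay_sec) are hypothetical
placeholders, not real spider.pl options.

```python
from typing import Any, Dict

# Hypothetical defaults standing in for the spider's built-in config.
DEFAULT_CONFIG: Dict[str, Any] = {
    "use_filter": True,
    "max_depth": 5,
    "delay_sec": 1,
}


def merge_config(default: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Start from a copy of the defaults, then apply the user's settings."""
    merged = dict(default)    # copy, so the defaults are left untouched
    merged.update(overrides)  # user settings win where both define a key
    return merged


# A user config that only changes one setting keeps the default
# filtering behaviour instead of silently losing it.
user_config = {"max_depth": 2}
effective = merge_config(DEFAULT_CONFIG, user_config)
```

With a plain override, the user's config would replace the defaults
entirely and filtering would have to be set up again by hand; with a
merge, only the keys the user names are changed.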

You can look at the dev docs if you are curious about the changes.

  http://swish-e.org/dev/docs/


-- 
Bill Moseley
moseley@hank.org

Received on Tue Oct 5 11:59:51 2004