RE: FileFilter with http

From: <Rainer.Scherg(at)>
Date: Thu Sep 07 2000 - 13:36:52 GMT

The filter feature in 1.3.2f was only tested on the filesystem (because I didn't
use http spidering - see readme file). But the filter should also work for
http indexing, because it's the same mechanism.

To track down the problem:

  - please upgrade to swish-e 2.0.1:

If the problem still exists, use a simple filter shell script to test
the files being filtered: send the file contents to stderr (e.g. via
"cat $1 | strings 1>&2") and print the arguments passed to the filter script.
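
A minimal sketch of such a debug filter, assuming a POSIX shell. The
function name debug_filter and its messages are only illustrative, and it
makes no assumption about how many arguments swish-e passes; it just
reports whatever it receives. If "strings" is not installed it falls back
to "cat".

```shell
#!/bin/sh
# Hypothetical debug filter: reports how it was called and what it was given.
# Everything goes to stderr, so nothing is handed back as filtered text.
debug_filter() {
    # Show the number of arguments and each argument in turn.
    echo "debug_filter: $# argument(s)" 1>&2
    for arg in "$@"; do
        echo "debug_filter: arg: $arg" 1>&2
    done

    # Dump the printable strings of the first argument, if it is a readable file.
    if [ -r "$1" ]; then
        if command -v strings >/dev/null 2>&1; then
            strings "$1" 1>&2
        else
            cat "$1" 1>&2
        fi
    else
        echo "debug_filter: cannot read: $1" 1>&2
    fi
}

debug_filter "$@"
```

Saved as an executable script and named in place of the real filter, this
should show whether the filter is called at all during http indexing and
which file it is handed. Since it writes nothing to stdout, nothing useful
will be indexed while it is in place.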

cu Rainer

-----Original Message-----
From: []
Sent: Wednesday, September 06, 2000 6:48 PM
To: Multiple recipients of list
Subject: [SWISH-E] FileFilter with http

I'm trying to use file filtering to index PDFs etc.
This works fine while I access the files through the file system, but
doesn't work when accessed by http. The PDFs are retrieved, but don't appear
to be filtered (the 2 words indexed symptom).

Have I missed something, or isn't this expected to work?

I'm using swish-e_1_3_2_f

Received on Thu Sep 7 13:38:22 2000