Re: [swish-e] .zip filter (was: "Select All" check box for swish.cgi)

From: Peter Karman <peter(at)>
Date: Fri Jun 25 2010 - 00:22:19 GMT

Bharatwaj Narayanan Iyengar wrote on 6/24/10 7:04 AM:
> Hi All ,
> My swish tool was running fine for 2 months
> Today I got a request to add indexing to zip files
> I just added .zip to the config file
> But even though swish said it indexed 7000 words
> None of the documents words in the zip came up on a search
> Search used was : swish-e.exe -w  swish
> Am I missing something while indexing files of zip ?  :(:(

Please do not hijack discussion threads.[0] It makes the conversation harder to
follow. If you reply to an existing post to raise a new topic, change the
subject line (as I did in this email).

As David Brown noted, you need a zip-specific filter, either with a FileFilter
or using a SWISH::Filter module. The SWISH::Filter::Decompress module has
several TODO items around .zip and .tar formats, because they (potentially)
represent multiple documents.

The best route is probably to write a script to unpack the file and concatenate
its contents into a single document for swish-e to index.
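
As a rough illustration of that approach, here is a minimal sketch in Python
(not swish-e's own tooling). The function name concat_zip_text is hypothetical,
and the script assumes the archive members are mostly UTF-8 text; binary
members would need their own filters. It writes one concatenated document to
stdout, which is how I would expect a filter program to hand text back to the
indexer, but check the swish-e docs for the exact arguments your FileFilter
setup passes.

```python
import sys
import zipfile


def concat_zip_text(zip_path, encoding="utf-8"):
    """Return the text of every file in the archive, joined into one string.

    Assumption: members are text. Bytes that do not decode are replaced
    rather than raising an error, so binary members yield noise.
    """
    parts = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            data = archive.read(info)
            # Include the member name so it is searchable too.
            parts.append(info.filename)
            parts.append(data.decode(encoding, errors="replace"))
    return "\n".join(parts) + "\n"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python zip2text.py /path/to/archive.zip
    sys.stdout.write(concat_zip_text(sys.argv[1]))
```

Note the trade-off: because everything is merged into a single document, a
search hit points at the .zip file as a whole, not at the individual file
inside it.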


