On 05/12/2008 11:25 AM, Francisco M. Vives wrote:
> Hi Guys,
> Is there a place with all the available filters to use with SWISH-E?
> I need to know all the types of files that can be filtered in that way,
> for example, is there any filter that performs OCR on jpg files that can
> be used with SWISH-E?
> One more thing, how flexible is SWISH-E with filters when using
> different versions of the filters?
> Last time I tried to index a PDF document I got some errors with the
> filter that seemed to happen because the filter didn't work for that
> version of PDF. So, what happens if I try to use the newest
> pdftotext.exe and where can anybody get the updated versions of the filters?
Judging from your question about pdftotext.exe, I assume you are using Windows. I believe
all the available filters are included with the Swish-e Windows installer. The list is
fairly modest. Look at the filters/swish-filter-test script in the distribution.
As far as OCR on jpg files, I do not know of any filter for that. The Swish-e filters just use
other, third-party programs to normalize non-text formats into something txt/html/xml for
Swish-e to parse. So if you can find a piece of software that does the OCR, you can write
a filter for it for Swish-e.
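
To illustrate the idea, here is a minimal sketch of such a wrapper in Python. It is not part of
the Swish-e distribution. The script name ocr_filter.py, the function names, and the choice of
the Tesseract command-line tool (invoked as "tesseract IMAGE stdout") are all assumptions made
for the example; any OCR program that can write plain text to stdout could be substituted.

```python
#!/usr/bin/env python3
"""Hypothetical OCR filter wrapper: writes the text found in an image to stdout."""
import subprocess
import sys

# Assumption: the Tesseract CLI is installed and accepts "tesseract <image> stdout".
# "{path}" is a placeholder of this sketch, replaced with the image path at run time.
DEFAULT_OCR_COMMAND = ["tesseract", "{path}", "stdout"]


def build_command(image_path, template=DEFAULT_OCR_COMMAND):
    """Substitute the image path into the command template."""
    return [arg.replace("{path}", image_path) for arg in template]


def ocr_to_text(image_path, template=DEFAULT_OCR_COMMAND):
    """Run the external OCR program and return whatever it wrote to stdout."""
    result = subprocess.run(
        build_command(image_path, template),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the OCR program fails
    )
    return result.stdout.decode("utf-8", errors="replace")


def main(argv):
    """Filter entry point: expects exactly one argument, the image path."""
    if len(argv) != 2:
        sys.stderr.write("usage: ocr_filter.py IMAGE\n")
        return 2
    sys.stdout.write(ocr_to_text(argv[1]))
    return 0


# Only act as a filter when run as a script with at least one argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

If this works on your system, a FileFilter line in the Swish-e config, modeled on the way
pdftotext is normally hooked up, should be enough to route .jpg files through it. Check the
exact syntax against the SWISH-CONFIG documentation for your version before relying on it.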
Peter Karman . peter(at)not-real.peknet.com . http://peknet.com/
Users mailing list
Received on Mon May 12 13:56:32 2008