Dr Michael Daly wrote on 3/14/12 9:14 PM:
> The funny thing is that *no* FileFilter options are specified in my
> swish1.conf:
>
> IndexOnly .htm .html .txt .doc .pdf .xls
> IndexContents TXT* .txt
> DefaultContents HTML*
>
> I can see both /opt/bin/catdoc and /opt/bin/pdftotext, with /opt/bin
> being in $PATH, so I presume there must be some hard-coding within swish-e
> that picks them up without configuring e.g. FileFilter.
>
> Should these directives be added?
> FileFilter .pdf pdf2html
> FileFilter .pdf pdftotext "'%p' -"
> FileFilter .doc /opt/bin/catdoc "-s8859-1 -d8859-1 %p"
>
> If not, can the parsing errors be ignored?
>
swish-e is trying to parse your .pdf files as HTML because no filter is
specified for them, so they fall through to DefaultContents HTML*. You must
specify a FileFilter for anything that is not txt, html or xml.
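
As a rough sketch of what that could look like for the configuration quoted
above (not a tested setup): the /opt/bin paths come from the original
message, while the quoting of %p and the extra IndexContents entries are
assumptions to check against the FileFilter section of the swish-e
configuration docs.

```
# Sketch only: paths and options are assumptions; adjust to your install.
IndexOnly .htm .html .txt .doc .pdf .xls

# Convert PDF and Word files to plain text before they reach a parser.
# Use one filter per extension.
FileFilter .pdf /opt/bin/pdftotext "'%p' -"
FileFilter .doc /opt/bin/catdoc "-s8859-1 -d8859-1 '%p'"

# The filters emit plain text, so route those extensions to the TXT parser.
IndexContents TXT* .txt .pdf .doc
DefaultContents HTML*
```

Note that .xls is still listed in IndexOnly without a filter in this sketch,
so it would hit the same HTML parsing problem; either add a filter for it or
drop it from IndexOnly.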
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Thu Mar 15 2012 - 03:17:12 GMT