At 11:10 AM 11/13/02 -0800, David THOMAS wrote:
>When I try to index a PDF :
>Skipping http://localhost/pdf/test.pdf: Wrong content type:
>application/pdf.
>
>although I have the FileFilter directive configured:
>FileFilter .pdf c:\path\xpdf\pdftotext "'%p' -"
>and this is a real PDF file.
Either spider with -S prog and spider.pl, or use the SWISH::Filter module
with -S http. To confuse things further, with -S prog and spider.pl you can
convert the PDF files with either SWISH::Filter or the pdf2html module.
Both are shown as examples in the prog-bin/SwishSpiderConfig.pl file.
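
As a rough sketch of the -S prog route (not a tested setup): the config
below assumes spider.pl and a copy of SwishSpiderConfig.pl sit in the
current directory, and that the spider config has been edited to start at
your server and to enable one of the PDF examples. The file name swish.conf
and both paths are assumptions for illustration.

    # swish.conf (sketch only; paths are assumptions)
    # With -S prog, IndexDir names the program to run.
    IndexDir ./spider.pl
    # Passed to spider.pl as its argument: the spider config file to load.
    SwishProgParameters ./SwishSpiderConfig.pl

Then run something like:

    swish-e -S prog -c swish.conf

On Windows the script may need to be run through the perl interpreter
rather than directly.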
FileFilter should work with -S http, though. I'll see if I can put together
a patch.
--
Bill Moseley
mailto:moseley@hank.org
Received on Thu Nov 14 00:24:08 2002