Pdf problem

From: Andrea Pasquini <pasquinigalde(at)not-real.virgilio.it>
Date: Tue Dec 14 2004 - 10:52:14 GMT
Hi,
I use swish-e 2.4.2 and I have a problem with PDF files.
After launching  $ ./swish-e -Sprog -c swish.conf  the following errors appear in the
output, and the crawler goes on:
...
Error: Couldn't find cidToUnicode file for the 'Adobe-WinCharSetFFFF' collection
Error: Unknown character collection 'Adobe-WinCharSetFFFF'
Error: Unknown font tag 'R137'
Error: May not be a PDF file (continuing anyway)
Error (0): PDF file is damaged - attempting to reconstruct xref table...
Error: Couldn't find trailer dictionary
Error: Couldn't read xref table
http://www.di.unipi.it/sindacati/21set2004.pdf - Using HTML2 parser -  (no
words indexed)
...
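
The "May not be a PDF file" line suggests one thing worth checking by hand: whether the fetched file really begins with a PDF header. Below is a minimal sketch of such a check, not part of swish-e or pdftotext. The function names `looks_like_pdf` and `file_looks_like_pdf` are hypothetical, the 1024-byte search window is an assumption about where a PDF reader looks for the marker, and it presumes a local copy of the document is available.

```python
def looks_like_pdf(data: bytes, window: int = 1024) -> bool:
    """Return True if the '%PDF-' marker occurs within the first `window` bytes.

    The window size is an assumption, not a documented pdftotext constant.
    """
    return b"%PDF-" in data[:window]


def file_looks_like_pdf(path: str, window: int = 1024) -> bool:
    """Read only the start of a local file (hypothetical path) and test it."""
    with open(path, "rb") as fh:
        return looks_like_pdf(fh.read(window), window)
```

If the check fails for a saved copy of the URL above, the server may have returned something other than a PDF (an HTML error page, for example), which would be a different problem from the filter configuration.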
I use pdftotext to filter the PDF files.

The configuration in swish.conf is:

FileFilter .pdf       pdftotext   " '%p' -"
IndexContents TXT2 .pdf

What is the problem?

Thanks in advance.

Cheers Andrea 
Received on Tue Dec 14 02:52:15 2004