Re: DirTree works in pipe but not config file on PDF

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Fri Jun 30 2006 - 21:30:43 GMT
I was suggesting that the -v3 option would tell you whether swish-e is in 
fact parsing swish_test.pdf or is somehow being passed something 
different. I just tried your example here and it worked for me, so this 
is a way for you to start debugging what's going on.
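
A related check can be done by hand. The "May not be a PDF file" error in the quoted output below suggests the PDF reader did not find a PDF header in what it was given. Here is a minimal shell sketch, assuming a well-formed PDF starts with the bytes %PDF- and that head -c is available on your system. The function name is_pdf is made up for illustration and is not part of swish-e or SWISH::Filter.

```shell
# Hypothetical helper: succeed if the file begins with the PDF magic string.
# Assumption: the header sits at byte 0 (some readers tolerate leading junk).
is_pdf() {
    # Read only the first 5 bytes and compare them with "%PDF-".
    [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}
```

For example, `is_pdf swish_test.pdf && echo "looks like a PDF"`. If the file on disk passes but the error persists, the problem is more likely in how the content reaches the filter than in the file itself.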

Gertjan Hofman scribbled on 6/30/06 3:59 PM:
> 
> Peter -
> 
> Not sure I understand - I am passing only 1 file -
> swish_test.pdf (as indicated in the config file I
> enclosed).  Of course I started with entire folders,
> but for the sake of demonstrating the problem I only
> parse the one file.
> 
> I note there are older messages in the mailing list
> with similar sounding problems - in that case
> spider.pl failed from a config file but worked in a
> pipe...
> 
> Thanks
> 
> Gertjan
> 
> 
> --- Peter Karman <peter@peknet.com> wrote:
> 
>>
>> Gertjan Hofman scribbled on 6/29/06 11:59 PM:
>>
>>> TRY 1: USING CONFIG FILE
>>>
>>> gertjan-laptop:~/tmp/swish_test> swish-e -S prog -c swish_file.conf
>>> Indexing Data Source: "External-Program"
>>> Indexing "./DirTree.pl"
>>> External Program found: ./DirTree.pl
>>> Error: May not be a PDF file (continuing anyway)
>>> Error (0): PDF file is damaged - attempting to reconstruct xref table...
>>> Error: Couldn't find trailer dictionary
>>> Error: Couldn't read xref table
>>> Removing very common words...
>>> no words removed.
>>> Writing main index...
>>> err: No unique words indexed!
>>>
>>
>> Add the -v3 option to get more verbose output. That
>> should tell you the name of the file being parsed with
>> SWISH::Filter (xpdf). I'm betting the file isn't
>> getting passed correctly.
>>
>> -- 
>> Peter Karman  .  http://peknet.com/  .  peter@peknet.com
>>

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Fri Jun 30 14:30:45 2006