RE: Swish-e Filtering on Win2003

From: Philippus, Brian <BPhilippus(at)not-real.nevp.com>
Date: Thu Mar 18 2004 - 20:47:11 GMT
I renamed that program to swish-filter-test.pl and ran it, and it filtered
the file just fine.  Then I ran it a second time, and it hung.  In Task
Manager I found pdftotext.exe still running.  When I killed pdftotext.exe,
swish-filter-test.pl completed and reported success.  I repeated this
several times: in 10 runs, swish-filter-test.pl completed without my
killing pdftotext.exe only once.  When I ran pdftotext.exe directly on the
file, using the syntax I found in the filter (pdftotext.exe filename -),
it ran quickly and completed every time.  I've tried two versions of
pdftotext.exe: one dated 3/21/2003 (552 KB) that came with Swish-e, and
one dated 1/22/2004 (488 KB) that I got out of xpdf-3.00-win32.zip.
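
A minimal sketch of how the hang rate could be measured outside of Swish-e,
assuming Python 3 is available on the server.  It runs a command repeatedly
with its output captured through a pipe and counts the runs that exceed a
timeout.  The function name count_hangs, the 10 runs, and the 30-second
timeout are illustrative choices, and capturing through a pipe is only an
assumption about how the filter invokes pdftotext.exe.

```python
import subprocess
import sys


def count_hangs(cmd, runs=10, timeout=30):
    """Run cmd `runs` times and return how many runs exceeded `timeout` seconds."""
    hangs = 0
    for _ in range(runs):
        try:
            # Capture stdout/stderr through pipes instead of a console.
            # Assumption: this resembles how a filter would read the output.
            subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child process before raising.
            hangs += 1
    return hangs
```

For example, count_hangs(["pdftotext.exe", "whitepaper.pdf", "-"]) would
report how many of 10 piped runs had to be killed, which could then be
compared with the behavior seen when running the command from a console.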

I also tried swish-filter-test.pl against some PDF files on the internet.
I get the same hanging behavior on:
http://www.openmobilealliance.org/syncml/download/whitepaper.pdf
But many other PDF files I found on the web run through the script fine
every time.

Interestingly, I never saw the "uninitialized value" error that I get from
spider.pl.



-----Original Message-----
From: Bill Moseley [mailto:moseley@hank.org] 
Sent: Thursday, March 18, 2004 10:24 AM
To: Philippus, Brian
Cc: Multiple recipients of list
Subject: Re: Swish-e Filtering on Win2003

On Thu, Mar 18, 2004 at 10:03:18AM -0800, Philippus, Brian wrote:
> BTW, I have tried this on two servers, and get the same problem on both
> (probably because I did the same thing).

You might also try using the swish-filter-test program which comes with
swish-e:

$ swish-filter-test http://localhost/apache/test.pdf

Document http://localhost/apache/test.pdf was  filtered.
   Document:     http://localhost/apache/test.pdf (http://localhost/apache/test.pdf)
   Content-Type: text/html
   Parser type:  HTML*

   >Filter used: SWISH::Filters::Pdf2HTML=HASH(0x8443b8c) ( application/pdf -> text/html )

That might make it easier to debug.  Can you put a test document
someplace that I can access?  If you can get the above example with
swish-filter-test to fail then that might be easier for me to debug.


-- 
Bill Moseley
moseley@hank.org
Received on Thu Mar 18 12:47:11 2004