Re: pdf2xml problem while indexing pdf files

From: Bill Moseley <moseley(at)>
Date: Wed Nov 12 2003 - 21:31:28 GMT
On Wed, Nov 12, 2003 at 12:12:12PM -0800, wrote:

> #!/usr/bin/perl -w
> use pdf2xml;
> my @files =
> system ('find /var/www/html/ccsp/docs/ -name *.pdf -print');
> # system ('find /var/www/html/ccsp/docs/ -name *.pdf >
> /var/www/html/ccsp/docs/results.file');

Does that work?  system() returns the exit status of the process, not
its output.  In Perl you use backticks to capture output, but for this I'd use the
standard File::Find module.  An example is in the file
located in the prog-bin directory of the distribution.
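
As a rough sketch of the File::Find approach (not the example shipped with the distribution), something like the following collects the PDF paths into an array. The helper name find_pdfs is made up for illustration, and the default directory is simply the one from the quoted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return every regular file under $dir whose
# name ends in .pdf (case-insensitive), sorted by path.
sub find_pdfs {
    my ($dir) = @_;
    my @files;
    find(
        sub {
            # Inside the callback $_ is the file's basename and
            # $File::Find::name is its full path.
            push @files, $File::Find::name if -f $_ && /\.pdf\z/i;
        },
        $dir
    );
    return sort @files;
}

# Directory from the quoted script; pass another one as the first argument.
my $dir = @ARGV ? $ARGV[0] : '/var/www/html/ccsp/docs';
if (-d $dir) {
    print "$_\n" for find_pdfs($dir);
}
```

Unlike system(), this returns the file names themselves, so @files holds paths rather than a status number.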

Then the module runs pdftotext and pdfinfo.  Are those in
your path?
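
One way to check this from Perl, as a sketch only: walk the directories in PATH and look for an executable with each name. The helper in_path is hypothetical and not part of any module mentioned here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of $prog if an executable
# file of that name exists in one of the PATH directories, else undef.
sub in_path {
    my ($prog) = @_;
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $prog);
        return $candidate if -f $candidate && -x $candidate;
    }
    return undef;
}

for my $prog (qw(pdftotext pdfinfo)) {
    my $where = in_path($prog);
    my $msg = defined($where) ? $where : 'not found in PATH';
    print "$prog: $msg\n";
}
```

Keep in mind that the PATH seen by a script run from cron or a web server can differ from the one in your login shell.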

> I and my tech support cannot figure out what "..file 65280.." is.  There is
> no such filename anywhere on the server and it is not a PDF file in our test
> directory (../ccsp/docs/). We are at a loss as to what to do next.

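It's probably the return code from the system call: the script assigns
what system() returns to @files, so the number ends up being treated as
a file name.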
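
To illustrate, here is a small sketch that decodes such a value the way perlfunc describes for system() and $?. The helper decode_status is a made-up name, and the sketch assumes 65280 really is the value system() returned in your script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a wait status, as returned by system(),
# into the exit code, the terminating signal, and the core-dump flag.
sub decode_status {
    my ($status) = @_;
    return (
        $status >> 8,             # exit code of the child process
        $status & 127,            # signal that killed it, if any
        ($status & 128) ? 1 : 0,  # true if a core dump was written
    );
}

my ($exit, $signal, $core) = decode_status(65280);
print "exit=$exit signal=$signal core=$core\n";   # exit=255 signal=0 core=0
```

So 65280 would correspond to the command exiting with status 255, which is a failure code and not a file.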

Bill Moseley
Received on Wed Nov 12 21:31:51 2003