Re: indexing PDF files by page?

From: Bill Moseley <moseley(at)>
Date: Wed Dec 17 2003 - 16:53:40 GMT
On Wed, Dec 17, 2003 at 11:22:52AM -0500, Mike Scarborough wrote:

> >Since the pdf to text conversion often includes the ^L to separate
> >pages, yes it's not hard to index individual pages.
> >
> thank you for your quick reply.  I'm sorry, but what is the ^L?  

It's the form feed character (ASCII 0x0C, Control-L).  xpdf inserts one
between pages by default.
From the xpdfrc man page:

       textPageBreaks yes | no
              If set to "yes", text extraction will insert page breaks (form feed
              characters) between pages.  This defaults to "yes".
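
To show how the form feeds can be used, here is a minimal sketch in Python that splits already-extracted text into numbered pages. The names split_pages and numbered_pages are made up for illustration and are not part of xpdf or of any indexer. It also assumes the extractor may emit a form feed after the last page, so an empty trailing chunk is dropped.

```python
FORM_FEED = "\f"  # ASCII 0x0C, displayed as ^L in many editors


def split_pages(text):
    """Split extracted text into one string per page.

    Hypothetical helper: assumes pages are separated by form feeds,
    as they are when textPageBreaks is left at "yes".
    """
    pages = text.split(FORM_FEED)
    # Assumption: a form feed may also follow the last page, which
    # leaves an empty chunk at the end. Drop it so it is not counted.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages


def numbered_pages(text):
    """Return (page_number, page_text) pairs, numbering from 1."""
    return list(enumerate(split_pages(text), start=1))


if __name__ == "__main__":
    sample = "first page\fsecond page\f"
    for number, page in numbered_pages(sample):
        print(number, repr(page))
```

Each (page number, text) pair could then be handed to the indexer as its own document.  Blank pages in the middle of the file are kept, so the numbering stays in step with the PDF.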

Bill Moseley
