Re: OCR filter available?

From: Dean Pentcheff <dean2(at)not-real.biol.sc.edu>
Date: Wed Aug 13 2003 - 19:13:40 GMT
Another package to check out is ABBYY Software's FineReader.  As far as
I know, it cannot be used as a filter; instead, each of the PDF files
would have to be processed "by hand" (I suspect, but don't know, that
OmniPage works the same way).  FineReader is much cheaper (US$150), and
we've been quite happy with its performance.

We've been OCRing digital photographs of old academic papers.  In our
case we're making PDFs of the images themselves, underlain with the
OCRed text.  The OCR is far from optimized (and is uncorrected), but it
provides enough accuracy to do useful indexing and searching within the
documents.
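
To illustrate why uncorrected OCR text can still be good enough for
searching, here is a minimal Python sketch of a word index built over
OCRed text.  It is not the setup described above and not how swish-e
works internally; the file names, the sample text, and the function
names (tokenize, build_index, search) are all hypothetical.  The point
is only that a misread word or two leaves the other words in a document
findable.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Map each token to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    matches = [index.get(token, set()) for token in tokens]
    return set.intersection(*matches)


# Hypothetical OCR output: "Tbe" and "0f" stand in for misread words.
docs = {
    "paper1.pdf": "Tbe isopod fauna 0f the Pacific coast",
    "paper2.pdf": "Notes on marine amphipods",
}
index = build_index(docs)
hits = search(index, "isopod pacific")
```

In this sketch the garbled words simply become unmatched tokens, while
a query on the correctly recognized words still finds paper1.pdf.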

We've just popped the index up, so if you'd like to take a peek, you can
look at the "site search" on:
	http://isopods.nhm.org

-Dean

On Wed, 2003-08-13 at 06:00, Carlos Rocha wrote:
> I am using Scansoft Omnipage 12 Office version to convert scanned images of 
> documents and make them available for searching, with quite good results. It 
> gives me about 95% accuracy.
> The software is quite expensive (list $499) but you can buy it a lot cheaper 
> on Ebay.
> 
> Carlos
> 
> >From: "Robert Keith" <Robert@Technolords.com>
> >Reply-To: Robert@Technolords.com
> >To: Multiple recipients of list <swish-e@sunsite.berkeley.edu>
> >Subject: [SWISH-E] OCR filter available?
> >Date: Tue, 12 Aug 2003 21:10:28 -0700 (PDT)
> >
> >Does anyone know of OCR software appropriate for swish-e filtering?  I have
> >a lot of PDF files that are scanned images of text documents; it would be
> >nice to search these documents.
> >
> >Robert Keith
> >
> 
-- 
Dean Pentcheff <dean2@biol.sc.edu>
Received on Wed Aug 13 19:14:04 2003