

Re: Indexing takes forever

From: David L Norris <dave(at)>
Date: Sun May 08 2005 - 15:47:16 GMT
On Fri, 2005-05-06 at 14:13 -0700, Nick wrote:
> 22865 Warning - /home/shared/Accounting/Capital/Update Capital 7-7-04.xls:
> Character in 'c' format wrapped in pack at
> /usr/lib/perl5/vendor_perl/5.8.6/Spreadsheet/ line 1790.

Not sure what that means but it doesn't seem to be very important.

> Error: Bad annotation action

xpdf says some unnamed PDF file has a bad annotation.  xpdf has a bad
habit of not telling you which file it is processing.

> Failed to set content type for document
> '/home/shared/Environmental_Community/Environmental/Awards/Independence
> Examiner Playground Article 12-13-02.mht'

SWISH::Filter doesn't know what type of file that is.

I think those are funky HTML files that have binary blobs encoded into
them for the images and such.  It's been a long time since I've seen
one, though.  Internet Explorer offers several odd file formats when you
click "Save As" on a web page.
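
For what it's worth, .mht files are normally MHTML archives: a MIME
multipart message with the HTML page as one part and the images as
base64-encoded parts.  A minimal sketch, assuming that layout, of pulling
out just the HTML with Python's standard email package.  The function name
extract_html_from_mht is made up for illustration and is not part of
SWISH::Filter:

```python
import email
from email import policy


def extract_html_from_mht(raw_bytes):
    """Return the first text/html part of an MHTML archive, or None."""
    # policy.default gives EmailMessage objects, whose get_content()
    # decodes the transfer encoding and charset for text parts.
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            return part.get_content()
    return None
```

The extracted HTML could then be handed to whatever already indexes
ordinary HTML files; the image parts are simply ignored.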

> Bad BBD entry!
> Broken OLE file. Try using -b switchFailed to set content type for
> document
> '/home/shared/Environmental_Community/Environmental/Training/Thumbs.db'

catdoc doesn't know how to decode Windows thumbnail catalogs, which
isn't surprising since thumbnails aren't Word documents.  The thumbnail
catalogs are likely OLE documents, and your system's MIME database must
be identifying them as Word documents.
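
A rough sketch of why the mix-up can happen, assuming Thumbs.db really is
OLE-based as guessed above: every OLE2 compound file starts with the same
8-byte signature, so a check on the header alone cannot tell a thumbnail
catalog from a Word document.  The helper names looks_like_ole and
should_skip are hypothetical, and skipping by file name is just one
possible workaround, not something Swish-e does for you:

```python
import os

# Signature found at the start of every OLE2 compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole(header):
    """True if the first 8 bytes match the OLE2 signature."""
    return header[:8] == OLE_MAGIC


def should_skip(path):
    """Skip Windows thumbnail catalogs by name before filtering."""
    return os.path.basename(path).lower() == "thumbs.db"
```

A directory walker could call should_skip() on each path before handing
the file to any filter, so catdoc never sees the thumbnail catalogs.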

> Do those matter?

On the whole, probably not at all.

> Also does the default SWISH::Filter install know about powerpoint files
> too?  I looked in /usr/lib/swish-e/perl/SWISH/Filters but I only see files
> that seem to reference ms word, ms excel, pdf, and mp3.  I see that ms
> powerpoint is advertised on your web page as being supported, but there
> doesn't seem to be much mention of it.

There should be a SWISH/Filters/ module.  It's in CVS, but
maybe it hasn't made it into a released version of Swish-e yet.  It
requires the ppthtml program to translate the PowerPoint documents to HTML.
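
Since the filter depends on an external program, it may be worth checking
that ppthtml is actually on the PATH before expecting PowerPoint files to
be indexed.  A small sketch using only the Python standard library; the
function name converter_available is invented for this example:

```python
import shutil


def converter_available(program="ppthtml"):
    """Return True if the named program can be found on the PATH."""
    return shutil.which(program) is not None
```

If converter_available() returns False, the PowerPoint files would most
likely be skipped or indexed as raw binary, depending on your config.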

 David Norris
  ICQ - 412039
Received on Sun May 8 08:47:21 2005