Re: Catdoc - following on from the Open2 problems i

Date: Sun Mar 28 2004 - 22:21:54 GMT
I had most success (under Win 2000) with the "original" version of catdoc, using the Perl call

my $shortname = Win32::GetShortPathName($filename);

to get around the long file name problem.  Even then, a small number of
MS-Word documents caused catdoc to hang with a "Bad BBD entry!" error
message.  (I excluded these files explicitly just to get something
working.)
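
For anyone wanting to try the same workaround, here is a rough sketch of how the two pieces (short path names plus an explicit exclusion list) might fit together.  It is not the script I actually ran: the names %skip, short_path and doc_to_text, and the example path in the skip list, are made up for illustration, and it assumes catdoc is on the PATH and writes the converted text to standard output.

```perl
use strict;
use warnings;

# Hypothetical exclusion list: documents known to make catdoc hang.
# Keys are lower-cased because Windows paths are case-insensitive.
my %skip = map { lc($_) => 1 } ('C:\\docs\\known bad file.doc');

# Return the 8.3 short path on Windows; on other systems (or if the
# lookup fails) return the path unchanged.
sub short_path {
    my ($filename) = @_;
    if ($^O eq 'MSWin32') {
        require Win32;
        my $short = Win32::GetShortPathName($filename);
        return $short if defined $short && length $short;
    }
    return $filename;
}

# Convert one Word document to text.  Returns undef if the file is on
# the skip list or the converter exits with a non-zero status.
sub doc_to_text {
    my ($filename, $catdoc) = @_;
    $catdoc = 'catdoc' unless defined $catdoc;   # assumed to be on the PATH
    return undef if $skip{ lc $filename };

    # The short path normally contains no spaces, so it is passed
    # unquoted here; a path with spaces would need quoting.
    my $short = short_path($filename);
    my $text  = qx{$catdoc $short};
    return $? == 0 ? $text : undef;
}
```

Note that this only avoids files already known to be bad; it does nothing about a catdoc process that hangs on a new document, so a timeout around the call would still be needed for unattended indexing.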

Can't remember now what the exact problems were with wvWare (just as likely
to be me!).

(Apologies for taking so long to get to this.)


On Wed, 2004-03-24 at 02:20, Ahmad, Zeeshan (FMC) wrote:
> I am trying to index word documents on windows using swish-e 2.4. Although
> some documents get indexed, for others Dr. Watson reports an access
> violation in catdoc.exe.

That's not surprising.  The version included with SWISH-E is an
unofficial port I made just to support long filenames.  wvWare is
probably the better choice since it's actively supported on Windows by
the upstream maintainers.  Also, wvWare is used by Abiword as the Word
document importer.  catdoc's maintainer indicated he has no interest in
supporting Windows.

> Is this wvWare feature ready for production use on Windows by any chance?

Well, all I can say is to test it with your documents.  I don't use
Windows at all beyond testing.  In my tests, wvWare works better than
catdoc and correctly converted all of the documents I had available.
catdoc failed to decode various text objects (links, etc.) in numerous
documents (on Windows and UNIX).  I've not seen catdoc crash; however, it
seems likely it could crash when reading documents with unexpected OLE
objects or markup.  And my catdoc port may crash for any number of other

