Re: Catdoc - following on from the Open2 problems i

From: Ahmad, Zeeshan (FMC) <Zeeshan.Ahmad(at)not-real.fmc.sa.gov.au>
Date: Wed Mar 24 2004 - 02:21:23 GMT
Hi David

I am trying to index Word documents on Windows using Swish-e 2.4. Although
some documents are indexed, for others Dr. Watson reports an access
violation in catdoc.exe.

Is this wvWare feature ready for production use on Windows by any chance?

__________________
 
Zeeshan Ahmad
FMC Computing Services
Bedford Park SA
 
Ph: 8204 6178
 

-----Original Message-----
From: swish-e@sunsite.berkeley.edu [mailto:swish-e@sunsite.berkeley.edu] On
Behalf Of David L Norris
Sent: Friday, 23 January 2004 4:00 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re: Catdoc - following on from the Open2 problems in Win
2000

On Thu, 2004-01-22 at 23:59, Allan_Watts@amp.com.au wrote:
> I guess I just battle on. (I am getting around the long filenames by
> copying the file somewhere else first, and I have a list of files, now,
> that I
> will ignore...)  Any suggestions appreciated..  (eg a way to trap errors
> from catdoc).

Earlier today I created a task to remind myself to replace or supplement
catdoc with wvWare before the next major release (i.e. maybe not soon).
This code is the basis for Abiword (and OpenOffice.Org?) Word import
plugins.  It works fairly well but I think it doesn't support document
formats older than Word 97.

wvHtml will convert to HTML.  So it's better than catdoc in that respect.


The GNU Win32 project maintains the Windows port of wvWare:
http://sourceforge.net/project/showfiles.php?group_id=23617&package_id=21335&release_id=211578

I'm not sure which of those files you need.  Probably .exe and maybe the
bin.zip?  I've done no testing under Windows at all.  But it works great
under Linux.

The command line to convert doc to HTML would be something like this:
  "wvHtml filename.doc -"
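
On trapping errors from the converter: one possible approach is to run it from a small wrapper that skips any file the converter fails on, so one bad document does not stop the whole indexing run. The following is only a rough Python sketch, not part of Swish-e. The function name convert_doc and its parameters are made up for illustration, and it assumes the converter writes its output to stdout when given "-" (as in the command line above) and exits with a nonzero status when it fails or crashes.

```python
import subprocess
import sys


def convert_doc(path, converter=("wvHtml",), timeout=60):
    """Run a converter on `path`; return its stdout as bytes, or None on failure.

    `converter` is the command prefix. The default mirrors the command line
    quoted in the message ("wvHtml filename.doc -"); this helper is hypothetical.
    """
    cmd = [*converter, path, "-"]
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Converter not installed, not executable, or hung past the timeout.
        print(f"skipping {path}: {exc}", file=sys.stderr)
        return None
    if result.returncode != 0:
        # Assumption: a failed or crashed converter reports a nonzero exit status.
        print(f"skipping {path}: exit status {result.returncode}", file=sys.stderr)
        return None
    return result.stdout
```

A caller could loop over the list of documents, pass each one to convert_doc, and hand only the non-None results to the indexer. The same wrapper should work for catdoc if the argument list is adjusted to match its command line.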

-- 
 David Norris
  http://www.webaugur.com/dave/
  ICQ - 412039



Received on Tue Mar 23 18:21:24 2004