Re: Displaying Word Documents(.doc) metanames using

From: David L Norris <dave(at)>
Date: Mon May 23 2005 - 05:54:46 GMT
On Sun, 2005-05-22 at 21:33 -0700, Jono Tan wrote:
> Im not sure how to display .doc metadata from the swish-e index.

Well, first you'll have to get the metadata into a swish-e index.  You
could do so by writing a SWISH::Filter Perl module, for example.

> example if i want to search for a word document with author Joe Doe (This 
> should in turn search for metanames: author = Joe Doe).  I can get this 
> working with OpenOffice files as I can look @ the "meta.xml" file, but 
> cannot do this with the Word documents (filter catdoc).

As far as I'm aware catdoc does nothing more than concatenate text
objects within a Microsoft Word document.  I'm fairly sure there's no
way to extract metadata using catdoc.

You can do so with wvSummary from wv:

$ wvSummary test_document.doc
The title is
The subject is
The author is David Norris
The keywords are
The comments are
The template was Normal
The last author was David Norris
The rev # was 7
no app name found
no pagecount
no wordcount
no charcount
no security
no codepage
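
To get from that output to something swish-e can index, the text needs to be turned into name/value pairs. Below is a rough sketch in Python rather than a finished SWISH::Filter module. The line pattern is inferred only from the sample output above, and the function names, the metaname normalisation, and the choice to emit HTML meta tags are my own assumptions. It also assumes the metanames are listed under MetaNames in your swish-e config.

```python
import html
import re
import subprocess

# Matches lines such as "The author is David Norris" or "The keywords are".
# Pattern inferred from the sample wvSummary output, not from wv's documentation.
_FIELD = re.compile(r"^The (?P<name>.+?) (?:is|are|was)(?: (?P<value>.*))?$")


def parse_wv_summary(text):
    """Return a dict of the non-empty summary fields in wvSummary output."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line.strip())
        if not match or not match.group("value"):
            continue  # skip empty fields and lines like "no pagecount"
        # Hypothetical normalisation: "last author" -> "last_author", "rev #" -> "rev"
        name = re.sub(r"[^a-z0-9]+", "_", match.group("name").lower()).strip("_")
        fields[name] = match.group("value").strip()
    return fields


def to_meta_tags(fields):
    """Render the fields as HTML meta tags, one per line."""
    return "\n".join(
        '<meta name="{}" content="{}">'.format(name, html.escape(value, quote=True))
        for name, value in fields.items()
    )


def summarize(path):
    """Run wvSummary on a .doc file (wv must be installed) and parse its output."""
    result = subprocess.run(
        ["wvSummary", path], capture_output=True, text=True, check=True
    )
    return parse_wv_summary(result.stdout)
```

A filter could then prepend to_meta_tags(summarize(path)) to the body text it gets from catdoc, so a search on author would have something to match.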

 David Norris
  ICQ - 412039
Received on Sun May 22 22:54:47 2005