Re: StoreDescription for XML, indexing Powerpoint

From: Alex Lyons <Alex.Lyons(at)not-real.sercoassurance.com>
Date: Fri Jan 24 2003 - 09:54:33 GMT
Andrew,

Try "ppthtml", which comes with "xlhtml".

It works most of the time, but it sometimes returns unexpectedly blank files. I'm not sure whether it can handle all of the very latest ppt formats. Overall, though, it is considerably better than "strings".
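
A rough sketch of how the two approaches could be combined in a filter: run the converter first, and fall back to a "strings"-style extraction when the result comes back blank. This is only an illustration. It assumes "ppthtml" is on the PATH and writes its HTML to standard output when given a file name; the function names (html_to_text, printable_strings, ppt_to_text) are made up for this example and are not part of xlhtml or of any indexer.

```python
import re
import shutil
import subprocess
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the non-blank text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of `html`, one text node per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def printable_strings(data, min_len=4):
    """Rough stand-in for strings(1): runs of printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return "\n".join(m.decode("ascii") for m in re.findall(pattern, data))


def ppt_to_text(path, converter="ppthtml"):
    """Extract indexable text from a PowerPoint file.

    Assumption: `converter` takes the file name as its only argument
    and prints HTML on stdout. If the converter is missing, fails, or
    produces no text, fall back to printable strings from the raw file.
    """
    html = ""
    if shutil.which(converter):
        result = subprocess.run(
            [converter, path], capture_output=True, text=True, errors="replace"
        )
        if result.returncode == 0:
            html = result.stdout
    text = html_to_text(html)
    if text:
        return text
    with open(path, "rb") as fh:
        return printable_strings(fh.read())
```

Calling ppt_to_text("talk.ppt") would then return plain text either way. If the converter emits its own boilerplate text even for a blank conversion, the "no text" check would need to be tightened accordingly.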

Alex.


>>> Andrew Smith <asmith@compbio.berkeley.edu> 23/01/03 23:03:28 >>>

Finally, this has probably been asked, but is there a Linux filter to use 
for filtering and indexing MS Powerpoint files (i.e. something like 
pdftotext for pdf)? I haven't been able to find a good free one, and was 
thinking of just using the "strings" command to extract printable strings 
from a file, but just want to know if there is anything better.

thanks,

Andrew Smith
Received on Fri Jan 24 09:54:58 2003