Re: [swish-e] Cant index ppt and xls shows no title

From: Peter Karman
Date: Mon, 28 Nov 2011 21:37:57 -0600
ChiFli wrote on 11/24/11 8:18 AM:
> After some more debugging I found that, for some reason, catppt was not being used: I ran swish-filter-test -verbose and it could not find the ppthtml program.
> I installed a version of ppt2html compatible with 64-bit CentOS, and now the program is found, but I don't get the actual content of the .ppt, as you can see:
> Final Content type for https://xxxxxxx/wp-content/uploads/2011/11/testppt.ppt is text/html
>   >Filter SWISH::Filters::pp2html=HASH(0x7bb6638) converted from [application/] to [text/html]
> Document https://xxxxxx/wp-content/uploads/2011/11/testppt.ppt was  filtered.
>    Document:     https://xxxxxx/wp-content/uploads/2011/11/testppt.ppt  (https://xxxxxxxx/wp-content/uploads/2011/11/testppt.ppt)
>    Content-Type: text/html
>    Parser type:  HTML*
>    >Filter used: SWISH::Filters::pp2html=HASH(0x7bb6638) ( application/ -> text/html )
> -- Output Content Sample --
> <HTML><HEAD><title>0LgeOiXUlB</title></HEAD><BODY>
> <HR>&nbsp;<br>
> <hr><FONT SIZE=-1>Created with <a href="">pptHtml</a></FONT><br>
> </BODY></HTML>
> ------
> Don't know why... :(

There are two PowerPoint converters supported in SWISH::Filter.

The first is catppt, part of catdoc. It is used by SWISH::Filters::ppt2txt.

The second is ppthtml, part of xlhtml. It is used by SWISH::Filters::pp2html.

It looks like your test was using the second. I suspect the xlhtml package is
just very old: it claims it was last updated in 2002. I expect your .ppt was
created with a version of PowerPoint newer than that, which could explain the
empty output.

I would try the catppt version; just make sure it is in your PATH.
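
A quick way to check the PATH point above is to look up both converter programs from the same account that runs the indexer. The following is only a minimal sketch in Python, not part of swish-e; the helper name find_converters is made up for illustration, and it assumes the binaries are named catppt and ppthtml as in this thread.

```python
import shutil

# Converter programs named in this thread (assumed binary names).
CONVERTERS = ["catppt", "ppthtml"]


def find_converters(names=CONVERTERS):
    """Return {program name: full path, or None if it is not on PATH}."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print where each converter was found, or flag it as missing.
    for name, path in find_converters().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If catppt shows up as not found, installing catdoc or adding its directory to PATH for the user that runs the spider would be the first thing to try. Keep in mind that a cron job or web server user may see a different PATH than your login shell.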

Peter Karman  .  .  peter(at)
Received on Tue Nov 29 2011 - 03:38:00 GMT