On Fri, Aug 19, 2005 at 04:53:27AM -0700, Benoit Guguin wrote:
> Ok thank you,
>
> I have tested with DirTree.pl and it works fine with xls, pdf and doc.
>
> So I'm currently looking to add filters for PowerPoint and OpenOffice
> (sxi, sxw, sxc), but I don't understand the source code :( ...
>
> If someone has already done this, could they share the file, please?

You just copy one of the existing filters in
$srcdir/filters/SWISH/Filters/ and adapt it. I see there's already a
pp2html.pm filter that requires the ppthtml program:

    perldoc pp2html.pm

    pp2html(3)     User Contributed Perl Documentation     pp2html(3)

    NAME
        SWISH::Filters::pp2html - Perl extension for filtering MS PowerPoint
        documents with Swish-e

    DESCRIPTION
        This is a plug-in module that uses the xlhtml package to convert MS
        PowerPoint documents to html for indexing by Swish-e.

        This filter plug-in requires the xlhtml package which includes ppthtml
        available at:

            http://chicago.sourceforge.net/xlhtml

        Currently produces document titles like /tmp/foo1234. Need to alter to
        pass actual document title.

    AUTHOR
        Randy Thomas

    SEE ALSO
        SWISH::Filter

Check the archives -- I think someone posted initial work on an
OpenOffice filter.
>
>
> Thanks again,
>
> Regards,
>
> Peter Karman wrote:
>
> >The .pm files:
> >
> > doc2txt.pm
> > pdf2html.pm
> > pdf2xml.pm
> >
> >are example modules that predate (iirc) the SWISH::Filters class. The reason
> >pdf2html works in your script is this line in the pdf2html.pm file:
> >
> > @EXPORT = qw(pdf2html);
> >
> >which tells Perl to make that function available in your script's namespace with
> >the 'use' function.
> >
> >I'd suggest using the DirTree.pl example script instead; it calls SWISH::Filter
> >for you correctly.
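
The export mechanism described above can be sketched in a single file. This
is only a minimal illustration, not code from the Swish-e distribution: the
packages Demo::Exporting and Demo::NotExporting and the functions
pdf2html_demo and doc2html_demo are hypothetical stand-ins, defined inline so
the script runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in modules, defined inline so the sketch is one file.
BEGIN {
    package Demo::Exporting;
    use Exporter 'import';                # borrow Exporter's import() method
    our @EXPORT = qw(pdf2html_demo);      # names exported by default
    sub pdf2html_demo { return "converted: $_[0]" }

    package Demo::NotExporting;           # same kind of function, no @EXPORT
    sub doc2html_demo { return "converted: $_[0]" }
}

# Does what "use Demo::Exporting;" would do for a module loaded from a file.
BEGIN { Demo::Exporting->import }

# The exported name is now visible in main.
my $pdf = pdf2html_demo('a.pdf');
print "$pdf\n";

# The non-exported function is reachable only by its fully qualified name.
my $doc = Demo::NotExporting::doc2html_demo('a.doc');
print "$doc\n";

# An unqualified call compiles, but dies at run time with a message that
# starts "Undefined subroutine &main::doc2html_demo called".
my $ok = eval { doc2html_demo('a.doc'); 1 };
print "error: $@" unless $ok;
```

If the sketch matches the situation in the original script, a function that
is missing from its module's @EXPORT list would need to be called by its
fully qualified name, or requested explicitly through @EXPORT_OK.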
> >
> >Benoit Guguin scribbled on 8/19/05 4:45 AM:
> >
> >
> >
> >>Hello,
> >>
> >>I am trying to index a directory containing only pdf, doc, xls and ppt files.
> >>
> >>
> >>I've seen some Perl scripts in version 2.5.4 to filter .ppt, .xls and .doc.
> >>
> >>I tried to use them with the prog method, but when I run swish-e (
> >>"swish-e -c /etc/swish-e/swish.conf -S prog") I get these errors:
> >>
> >>Undefined subroutine &main::Doc2html called at /etc/swish-e/swish.pl
> >>line 55.
> >>Or
> >>Undefined subroutine &main::pp2hml called at /etc/swish-e/swish.pl
> >>
> >>The error depends on the order of the functions.
> >>
> >>
> >>So I don't understand why it works fine for pdf but not for the other
> >>formats...
> >>
> >>I've looked through the mailing list archive but haven't found my Holy Grail ;)
> >>
> >>Any ideas, please?
> >>
> >>Regards,
> >>
> >>
> >>
> >>
> >
> >
> >
>
>
> --
> Guguin Benoit
> Société Alixen 2 rue Jean Rostand 91 893 Orsay Cedex France
> Tel : 01 69 85 24 13, Fax : 01 69 85 24 10
>
>
--
Bill Moseley
moseley@hank.org
Received on Fri Aug 19 07:07:25 2005