On Wed, Oct 22, 2003 at 11:39:04AM -0700, Bruce Pettyjohn wrote:
> The problem comes when trying to start with an html index and crawl through
> the document list containing the Excel files with this command:
>
> /usr/local/lib/swish-e/spider.pl default http://www.varianinc.com/epindex.htm
>
> All of the docs are found. Only the Word docs are filtered.
>
> Again this works for the individual file:
>
> /usr/local/lib/swish-e/spider.pl default http://www.varianinc.com/test.xls
Well, then that's a bug. Actually, there were two bugs -- I had moved a Makefile
and the filter was getting installed in the wrong location.

I'm sure glad you caught that. I wrote my test in the "right" order, so
the problem didn't show up.

Sorry for the trouble, and sorry that the xls filter is so slow!

I just created a new daily snapshot with the fix:
http://swish-e.org/dev/swish-daily/swish-e-2.4.0-pr4-2003-10-22.tar.gz
But the fix is not hard to apply by hand if you don't want to reinstall.
In the convert subroutine in Filter.pm:

+    my $done;
     for my $filter ( @filter_set ) {
+        if ( $done ) {
+            push @cur_filters, $filter;
+            next;
+        }
+
         # All done?
-        last unless $doc_object->continue( 0 );
+        $done++ unless $doc_object->continue( 0 );
     }
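
A minimal standalone sketch of why the early "last" could matter. It assumes
the loop rebuilds @cur_filters as the list of filters kept for later documents,
which the patch suggests but this message does not spell out. The run_filters
sub, the $handles_doc callback, and the 'word'/'excel' labels are hypothetical
stand-ins, not part of Filter.pm:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the loop in convert(). Each pass rebuilds
# @cur_filters, the filters assumed to be kept for the next document.
sub run_filters {
    my ( $filter_set, $handles_doc, $use_fix ) = @_;
    my @cur_filters;
    my $done;

    for my $filter ( @$filter_set ) {
        if ( $done ) {
            # Patched behavior: keep filters that were not needed this time.
            push @cur_filters, $filter;
            next;
        }

        push @cur_filters, $filter;

        if ( $handles_doc->( $filter ) ) {
            # Unpatched behavior: leave the loop, losing the rest of the list.
            last unless $use_fix;
            $done++;
        }
    }
    return @cur_filters;
}

my @filter_set = qw( word excel );            # illustrative labels only
my $is_word    = sub { $_[0] eq 'word' };     # pretend a Word doc arrived

my @old = run_filters( \@filter_set, $is_word, 0 );
my @new = run_filters( \@filter_set, $is_word, 1 );

print "without fix: @old\n";
print "with fix:    @new\n";
```

Under that assumption, a Word document handled first would leave no Excel
filter in the list for the documents that follow, which would be consistent
with the crawl filtering only the Word docs while a single .xls file worked.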
--
Bill Moseley
moseley@hank.org
Received on Wed Oct 22 19:54:38 2003