Re: API performance

From: Ahmad, Zeeshan (FMC) <Zeeshan.Ahmad(at)not-real.fmc.sa.gov.au>
Date: Wed Jan 21 2004 - 06:23:36 GMT
These stats (Run time and Search time) are displayed by the swish script
itself - it was not an attempt at benchmarking.

__________________
 
Zeeshan Ahmad
FMC Computing Services
Bedford Park SA
 
Ph: 8204 6178
 

-----Original Message-----
From: swish-e@sunsite.berkeley.edu [mailto:swish-e@sunsite.berkeley.edu] On
Behalf Of Bill Moseley
Sent: Wednesday, 21 January 2004 3:26 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re: API performance

On Tue, Jan 20, 2004 at 05:59:52PM -0800, Ahmad, Zeeshan (FMC) wrote:
> I have noticed considerable performance difference when using SWISH::API
> 
> Using API run time is 4.5 seconds
> Using the binary run time is 0.5 seconds!

Not sure how to respond to a statement like that.  Well, I know, I'll
benchmark and provide some supporting code that others can try.
Code and output are below, but to summarize (and forking on Linux is
FAST!)

Benchmark: timing 1000 iterations of Using Binary, Using SWISH::API...
Using Binary: 65 wallclock secs ( 0.44 usr  0.57 sys + 55.38 cusr  7.29 csys = 63.68 CPU) @ 990.10/s (n=1000)
Using SWISH::API: 10 wallclock secs ( 9.45 usr +  0.58 sys = 10.03 CPU) @ 99.70/s (n=1000)

That's searching the swish-e list archive with 5743 results in the
search and loading an array of twenty results.
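
A note on reading those numbers: the "/s" rate does not line up with the
wallclock column. Judging from the figures, it is derived from the parent's
usr + sys time only (1000 / (0.44 + 0.57) = 990.10), so the CPU time spent in
the forked swish-e children is left out of the rate. The wallclock figures are
the ones to compare. A small sketch of that arithmetic, using only the numbers
reported above (the hash layout is just for illustration):

```perl
use strict;
use warnings;

my $n = 1000;    # iterations, as passed to timethese()

# Figures copied from the benchmark summary above.
my %run = (
    'Using Binary'     => { wall => 65, usr => 0.44, sys => 0.57 },
    'Using SWISH::API' => { wall => 10, usr => 9.45, sys => 0.58 },
);

for my $name ( sort keys %run ) {
    my $r = $run{$name};

    # Rate as printed by Benchmark: iterations / parent CPU time only.
    my $printed_rate = $n / ( $r->{usr} + $r->{sys} );

    # Rate a user would actually see: iterations / wallclock time.
    my $wall_rate = $n / $r->{wall};

    printf "%-17s printed %.2f/s, wallclock %.2f/s (%.0f ms per search)\n",
        $name, $printed_rate, $wall_rate, 1000 * $r->{wall} / $n;
}
```

By wallclock, the API does roughly 100 searches per second against about 15
for the forked binary.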

Hell of a lot easier to code with SWISH::API, too.

I would suggest that something else is causing your slowdown.  If you
make a statement like that please provide something that demonstrates
and supports what you are saying -- like a bit of code that others can
test.


> In both cases the index file is same. I am using 2.4.0 on Win 2K.
> 
> Also worth noting is the fact that while using swish-e binary fails when
> the script is run via PerlIIS, it works when using API instead. When using
> the binary I get the following error:
> 
> -----
> open2: Can't call method "close" on an undefined value at
> D:/Perl/lib/IPC/Open3.pm line 338

Sorry, I can't help you there.  If you want to resolve this I would
suggest creating a small, no tiny, script that will show the problem and
try PerlMonks as there seem to be some Windows people there.

I can't look at that line either -- I seem to not have enough lines in
my versions:

    $ wc -l /usr/share/perl/5.8.2/IPC/Open?.pm
      39 /usr/share/perl/5.8.2/IPC/Open2.pm
     268 /usr/share/perl/5.8.2/IPC/Open3.pm
     307 total


> 
> script line is:
> my $pid = IPC::Open2::open2($rdrfh, $wtrfh, @command );
> -----
> 
> Looking at some other posts windows forking seems to be problematic. Is it
> due to the fact that when running via PerlIIS, script tries to fork but
> fails because each forked process needs a unique id and all children
> created would get IIS id? Just a wild guess...

Did you print $pid?  Fork is emulated in threads under Windows, and IIRC
the pid is returned like a unique pid.  But check it and see.  I'm not
sure what PerlIIS is or does, but if the fork is failing I'll bet it's
because of your environment, not something about pids.
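
Something along these lines would show what open2 returns and whether it
throws. It is only a sketch: the child command is a stand-in (perl printing
one line) rather than your swish-e command line, and it is untested under
PerlIIS. open2 raises an exception on failure, hence the eval.

```perl
use strict;
use warnings;
use IPC::Open2;

# Stand-in command (an assumption): run the current perl and print one line.
# Substitute the real swish-e command list here.
my @command = ( $^X, '-e', 'print "hello\n"' );

my ( $rdrfh, $wtrfh );

# open2 dies on failure, so trap the exception and report it.
my $pid = eval { open2( $rdrfh, $wtrfh, @command ) };
die "open2 failed: $@" unless defined $pid;

print "child pid: $pid\n";

close $wtrfh;               # nothing to send to the child
my @out = <$rdrfh>;         # read everything the child wrote
close $rdrfh;

waitpid $pid, 0;            # reap the child and collect its status
print "exit status: ", $? >> 8, "\n";
print @out;
```

If the eval reports an error, that message should narrow down where in the
environment things go wrong.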

Here's the benchmark program.

#!/usr/bin/perl -w
use strict;
use Benchmark;
 
unshift @INC, `swish-filter-test --path`;
use SWISH::API;
 
my $swish = SWISH::API->new('index.swish-e');
$swish->AbortLastError if $swish->Error;
my $search = $swish->New_Search_Object;
 
sub api {
    my $results = $search->Execute('swish');
    $results->SeekResult(49);
    my @results;
    push @results, $results->NextResult for 1..20;
    return @results;
}


sub binary {
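    # Fork with an implicit pipe: the parent reads the child's STDOUT via SWISH.
    my $pid = open(SWISH, '-|' );
    die "failed to fork" unless defined $pid;

    # In the child ($pid == 0), replace the process with swish-e.
    exec qw/ swish-e -w swish -b50 -m20 -H0 -x /,
         "<swishdocpath>\t<swishtitle>\n"
        unless $pid;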
 
    my @results;
    while ( <SWISH> ) {
        chomp;
        my %properties;
        @properties{ qw/ swishdocpath swishtitle /} = split /\t/;
        push @results, \%properties;
    }

    return @results;
}

my @results = api();
print( join( ": ", $_->Property('swishdocpath'),
                   $_->Property('swishtitle') ), "\n" )
    for @results;

print '-'x50, "\n";

@results = binary();
print "$_->{swishdocpath}: $_->{swishtitle}\n" for @results;

print '-'x50, "\n";

timethese( 1000, {
    'Using SWISH::API'  => \&api,
    'Using Binary'      => \&binary,
});

$ perl bench.pl 
./archive/1999-01/0757.html: RE:  SWISH++ list (was: SWISH-E Posting Policy)
./archive/1997-11/0073.html: problems with temporary files (etc.) using swish-e -M
./archive/2003-01/5066.html: Problems with SWISH-Stemmer on Windows
./archive/2003-03/5307.html: Re: modifying swish.cgi output
./archive/1999-02/0827.html: swish-e index corrupted ?
./archive/1998-07/0391.html: Re: [SWISH-E:398] Re: SWISH++ 1.2.1 released
./archive/2002-04/3907.html: What is Swish-e?
./archive/1998-11/0592.html: Re:  Re: SWISH++ 1.3 released
./archive/1998-11/0588.html: Re:  RE: Combining SWISH++ and Swish-E
./archive/2003-09/6198.html: FW: Re: Filtering problems
./archive/2003-09/6182.html: Re: swish.cgi results no path in title
./archive/2003-06/5704.html: Re: configuring and debugging swish.cgi with IIS
./archive/2001-08/3062.html: Re: Indexing of word documents, stored on a UNIX
./archive/1998-11/0582.html: RE: Combining SWISH++ and Swish-E
./archive/2003-02/5193.html: Re: swishex setup
./archive/2002-11/4903.html: RE: how to get a description
./archive/1998-01/0106.html: Re: [SWISH-E:109] who stole swish_create.pl?
./archive/2003-06/5728.html: Patch
./archive/2003-11/6400.html: Re: Fixing Swish 2.4.0 To Work on Windows
./archive/2003-05/5526.html: Re: swish-e
--------------------------------------------------
./archive/1999-01/0757.html: RE:  SWISH++ list (was: SWISH-E Posting Policy)
./archive/1997-11/0073.html: problems with temporary files (etc.) using swish-e -M
./archive/2003-01/5066.html: Problems with SWISH-Stemmer on Windows
./archive/2003-03/5307.html: Re: modifying swish.cgi output
./archive/1999-02/0827.html: swish-e index corrupted ?
./archive/1998-07/0391.html: Re: [SWISH-E:398] Re: SWISH++ 1.2.1 released
./archive/2002-04/3907.html: What is Swish-e?
./archive/1998-11/0592.html: Re:  Re: SWISH++ 1.3 released
./archive/1998-11/0588.html: Re:  RE: Combining SWISH++ and Swish-E
./archive/2003-09/6198.html: FW: Re: Filtering problems
./archive/2003-09/6182.html: Re: swish.cgi results no path in title
./archive/2003-06/5704.html: Re: configuring and debugging swish.cgi with IIS
./archive/2001-08/3062.html: Re: Indexing of word documents, stored on a UNIX
./archive/1998-11/0582.html: RE: Combining SWISH++ and Swish-E
./archive/2003-02/5193.html: Re: swishex setup
./archive/2002-11/4903.html: RE: how to get a description
./archive/1998-01/0106.html: Re: [SWISH-E:109] who stole swish_create.pl?
./archive/2003-06/5728.html: Patch
./archive/2003-11/6400.html: Re: Fixing Swish 2.4.0 To Work on Windows
./archive/2003-05/5526.html: Re: swish-e
--------------------------------------------------
Benchmark: timing 1000 iterations of Using Binary, Using SWISH::API...
Using Binary: 65 wallclock secs ( 0.58 usr  0.65 sys + 55.69 cusr  6.74 csys = 63.66 CPU) @ 813.01/s (n=1000)
Using SWISH::API: 10 wallclock secs ( 9.38 usr +  0.84 sys = 10.22 CPU) @ 97.85/s (n=1000)



-- 
Bill Moseley
moseley@hank.org
Received on Wed Jan 21 06:24:48 2004