Re: Silly Logic problem w. Meta Tags

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Aug 16 2002 - 18:05:43 GMT
At 09:58 AM 08/16/02 -0700, David VanHook wrote:
>OK, I fiddled around with the meta tag names, but that appeared not to make
>any difference.  Still not working on multiple Meta tags. But it does sound
>like maybe using the new 2.1dev is a good idea, anyway -- should I just pick
>the latest daily snapshot?

Yes.  I actually use cvs every place I use swish -- makes it really easy to
get updates or to fetch a given version.


>Once we've got it installed and config'd, we'll be using this on a website
>which gets about 1 million hits per week, and about 3,000 searches a day.
>Is the dev version ready for that kind of spotlight?

Two a minute?  I sure hope so.  

You can either fork a swish process for every request, or use the swish
library to embed swish in your application.  Here's a quick benchmark
against an index of 50,000 files, searching for "we or you or them":

Benchmark: timing 3000 iterations of fork_swish, library_swish...

fork_swish: 148 wallclock secs 
  ( 4.44 usr  1.43 sys + 111.02 cusr 31.07 csys = 147.96 CPU) 
  @ 511.07/s (n=3000)

library_swish: 62 wallclock secs 
  (37.51 usr + 24.53 sys = 62.04 CPU) @ 48.36/s (n=3000)

Fork: 3000 requests, 2067000 total results
Library: 3000 requests, 2067000 total results

The library is even faster for a simple single-keyword search, "hello":

library_swish: 26 wallclock secs
  (13.82 usr + 11.76 sys = 25.58 CPU) @ 117.28/s (n=3000)

Library: 3000 requests, 765000 total results
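
The rates in the first benchmark's output are easier to compare once they are recomputed from wallclock time. The following is a minimal sketch of that arithmetic, using only figures copied from the output above and the 3,000 searches a day from the question. The `rate` helper is a name introduced here for illustration, and the daily load is assumed to be spread evenly, which real traffic will not be.

```perl
use strict;
use warnings;

# Requests per second for $n requests completed in $seconds.
sub rate {
    my ( $n, $seconds ) = @_;
    return $n / $seconds;
}

my $n = 3000;    # iterations in the benchmark run

# Figures copied from the first benchmark's output.
my %run = (
    fork_swish    => { wall => 148, parent_cpu => 4.44 + 1.43 },
    library_swish => { wall => 62,  parent_cpu => 37.51 + 24.53 },
);

for my $name ( sort keys %run ) {
    printf "%-14s %7.2f/s by wallclock, %7.2f/s by parent CPU\n",
        $name,
        rate( $n, $run{$name}{wall} ),
        rate( $n, $run{$name}{parent_cpu} );
}

# Average load from the question: 3,000 searches a day, assumed evenly spread.
printf "needed         %7.3f/s on average\n", rate( 3000, 24 * 60 * 60 );
```

By wallclock, the forking version works out to roughly 20 searches a second and the library version to roughly 48, against an average need of well under one a second. The 511.07/s printed for fork_swish equals 3000 / (4.44 + 1.43), so it appears to count only the parent's CPU time and not the time spent in the swish-e child processes.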


>Or will there be a new
>stable release sometime soon which perhaps we should wait for?

You should see the pile on my desk.  I wouldn't wait.

Disclaimer:  Benchmarks are never right.  Here's the code:

moseley@bumby:~/swish-e/src$ cat bench.pl
#!/usr/local/bin/perl -w
use strict;
use SWISHE;
use Symbol;
use Benchmark;

# Counters updated by the two benchmark subs.
my $fork_count_recs = 0;
my $fork_count      = 0;
my $lib_count_recs  = 0;
my $lib_count       = 0;

my $query = 'food not hello';

# Open the index once; library_swish reuses this handle for every search.
my $handle = SwishOpen( 'index.swish-e' )
    or die "Failed to open index";

timethese( 3000, {
    'library_swish' => \&library_swish,
    'fork_swish'    => \&fork_swish,
});

print "Fork: $fork_count requests, $fork_count_recs total results\n";
print "Library: $lib_count requests, $lib_count_recs total results\n";
    
sub library_swish {

    my $num_results = SwishSearch($handle, $query, 1, '','' );
    if ( $num_results <= 0 ) {
        print( $num_results
            ? SwishErrorString( $num_results )
            : 'No Results' );

        my $error = SwishError( $handle );
        print "\nError number: $error\n" if $error;
        return;  # or next.
    }    

    my @recs;
    while ( (my @rec = SwishNext( $handle )) ) {
        push @recs, \@rec;
    }
    $lib_count_recs += @recs;
    $lib_count++;
}
    

sub fork_swish {
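    my $swish = gensym;

    # Forking open: the parent gets the child's pid, the child gets 0.
    my $pid = open( $swish, '-|' );
    die "failed to fork" unless defined $pid;

    if ( !$pid ) {
        # Child: its STDOUT is piped back to $swish in the parent.
        exec( "./swish-e", '-w', $query, '-H0' )
            or die "failed to exec";
    }

    my @recs = <$swish>;
    close $swish;    # waits for the child to exit
    $fork_count_recs += @recs;
    $fork_count++;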
}
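
The forking-open pattern used in fork_swish can be tried without a swish-e binary or an index. The following is a stand-alone sketch, not part of the benchmark: `run_child` is a hypothetical helper, and the running perl interpreter (`$^X`) stands in for `./swish-e`. It also checks the child's exit status, which a production search wrapper would probably want to do.

```perl
use strict;
use warnings;

# Run a command in a child process and return its output lines.
# Dies if the fork, the exec, or the command itself fails.
sub run_child {
    my @cmd = @_;

    # Forking open: returns the child's pid in the parent, 0 in the child.
    my $pid = open( my $fh, '-|' );
    die "failed to fork: $!" unless defined $pid;

    if ( !$pid ) {
        # Child: STDOUT is connected to $fh in the parent.
        exec { $cmd[0] } @cmd or die "failed to exec $cmd[0]: $!";
    }

    my @lines = <$fh>;

    # close() on a piped handle waits for the child and sets $?.
    close $fh or die "child exited with status ", $? >> 8, "\n";
    return @lines;
}

# Stand-in command: the running perl prints three lines.
my @recs = run_child( $^X, '-e', 'print "rec $_\n" for 1 .. 3' );
print scalar(@recs), " results\n";
```

Passing the command as a list, as the benchmark does, keeps the shell out of the picture, so a query containing quotes or other shell metacharacters reaches the program unchanged.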
    

-- 
Bill Moseley
mailto:moseley@hank.org
Received on Fri Aug 16 18:09:13 2002