Re: Combining stem/non stem removing dups in perl

From: Peter Karman <karman(at)not-real.cray.com>
Date: Wed Nov 03 2004 - 22:41:17 GMT
I do something similar and found that your approach (hash keys) was the 
simplest. I use a counter so that I know when I've hit the appropriate 
number of 'hits' based on the initial range I was looking for.

e.g. (CODE UNTESTED AND UNVERIFIED)

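# create the query
# create api object
# search, e.g. my $results = $swish->Query( $query );

my $hitsIwant = 20;
my %uniq;

# NextResult is called on the results object returned by the search,
# not on the api object itself
while (my $result = $results->NextResult)
{

	my $prop = $result->Property( 'key' );
	next if $uniq{$prop}++;    # skip a record whose key was already seen
	# ... handle $result here (first time this key is seen) ...
	last if scalar( keys %uniq ) == $hitsIwant;

}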
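
A slightly fuller sketch of the same idea, written so it runs without swish-e installed. MockResult and MockResults are hypothetical stand-ins that mimic only the NextResult and Property calls used in the loop above, and unique_hits is an illustrative name, not part of SWISH::API. It assumes the unstemmed hits arrive before the stemmed ones, so the first copy of each key is the one worth keeping.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a result object: only Property() is mimicked.
package MockResult;
sub new {
    my ( $class, %props ) = @_;
    my $self = {%props};
    return bless $self, $class;
}
sub Property { my ( $self, $name ) = @_; return $self->{$name} }

# Hypothetical stand-in for a results object: NextResult() returns undef
# once the list is exhausted.
package MockResults;
sub new {
    my ( $class, @rows ) = @_;
    my $self = { rows => [@rows] };
    return bless $self, $class;
}
sub NextResult { my ($self) = @_; return shift @{ $self->{rows} } }

package main;

# Collect up to $want results, keeping only the first result seen per key.
sub unique_hits {
    my ( $results, $key_prop, $want ) = @_;
    my ( %seen, @hits );
    while ( my $result = $results->NextResult ) {
        my $key = $result->Property($key_prop);
        next if $seen{$key}++;     # duplicate record, first copy already kept
        push @hits, $result;
        last if @hits == $want;    # stop reading once the page is full
    }
    return @hits;
}

# Sample data: exact (unstemmed) hits first, then stemmed hits.
my @rows;
for my $pair ( [ a => 'exact' ], [ b => 'exact' ], [ a => 'stem' ],
               [ c => 'stem' ],  [ d => 'stem' ] )
{
    push @rows, MockResult->new( key => $pair->[0], source => $pair->[1] );
}

my @hits = unique_hits( MockResults->new(@rows), 'key', 3 );

my @labels;
push @labels, $_->Property('key') . ':' . $_->Property('source') for @hits;
print "@labels\n";    # prints "a:exact b:exact c:stem"
```

Stopping as soon as the page is full keeps the cost of the hash lookups down, since the rest of the result list is never read. The size of the returned list is the de-duplicated count for that page; it says nothing about the total number of unique records in the whole result set.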

Brad Miele wrote on 11/3/04 3:33 PM:

> Hi,
> 
> I am trying to run a query against two indexes that contain the same set
> of records. The first is indexed without stemming; the second is stemmed. The
> goal is to have exact matches come up at the start of the results and
> the stemmed ones towards the end. I am sorting the stemmed records to
> the end of the list using a value added during indexing.
> 
> The problem is that since the indexes contain the same records, I want to
> remove the duplicates from the results. I can do it using hash keys during
> the ->NextResult phase of the search, but it slows things down and then
> the number returned by Hits is off.
> 
> Has anyone implemented something along these lines?
> 
> Brad
> ------------------------------------------------------------
>  Brad Miele
>  Technology Director
>  AuroraPhotos.com
>  (207) 828-8787 x110
>  bmiele@auroraphotos.com
> 
>  panic: can't find /
> 

-- 
Peter Karman  .  http://www.cray.com/craydoc/ .  karman(at)not-real.cray.com
Received on Wed Nov 3 14:41:18 2004