Re: SWISH-E ranking algorithm - description?

From: Peter Karman <peter(at)>
Date: Fri Jan 20 2006 - 17:35:57 GMT

Tito Sierra scribbled on 1/20/06 11:17 AM:

> Hello,
> I'm looking for some documentation describing the current ranking  
> algorithm in use by either swish-e 2.4.X or 2.2.X.  This need not be  
> a full technical description, but a description of what factors  
> influence document ranking.
> Is it accurate to say that SWISH-E employs a variant of "tf-idf" for  
> ranking?
> From reading the mailing list archives I understand there is  
> interest in improving the ranking algorithm.  For my purposes SWISH-E  
> works great as is.  Very nice tool.  My immediate interest is not in  
> tweaking or improving the current algorithm, but in describing it to  
> others.  If someone has already gone through the trouble of writing  
> this up I would love to cite them.  Apologies if I have missed  
> something in the mail archives or the website.
> Thanks,
> Tito
> P.S. I have read Josh Rabinowitz's very useful article "Indexing  
> Arbitrary Data with SWISH-E."  I'm hoping for something more  
> descriptive than this:
> "The ranking algorithm used in swish-e does not bear easy  
> explanation, but does take into account factors including the size of  
> the documents, the frequency of each word in the document, and which  
> tags the given text resides in."

Peter Karman  .  .  peter(at)
Received on Fri Jan 20 09:36:02 2006