See the FAQ entry on how ranking is calculated:
http://swish-e.org/docs/swish-faq.html#how_is_ranking_calculated_

Tito Sierra scribbled on 1/20/06 11:17 AM:
> Hello,
>
> I'm looking for some documentation describing the current ranking
> algorithm in use by either swish-e 2.4.X or 2.2.X. This need not be
> a full technical description, but a description of what factors
> influence document ranking.
>
> Is it accurate to say that SWISH-E employs a variant of "tf-idf" for
> ranking?
> http://en.wikipedia.org/wiki/Tf-idf
>
> From reading the mailing list archives I understand there is
> interest in improving the ranking algorithm. For my purposes SWISH-E
> works great as is. Very nice tool. My immediate interest is not in
> tweaking or improving the current algorithm, but in describing it to
> others. If someone has already gone through the trouble of writing
> this up I would love to cite them. Apologies if I have missed
> something in the mail archives or the swish-e.org website.
>
> Thanks,
> Tito
>
> P.S. I have read Josh Rabinowitz's very useful article "Indexing
> Arbitrary Data with SWISH-E." I'm hoping for something more
> descriptive than this:
>
> "The ranking algorithm used in swish-e does not bear easy
> explanation, but does take into account factors including the size of
> the documents, the frequency of each word in the document, and which
> tags the given text resides in."
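
For anyone comparing against the tf-idf scheme asked about above, here
is a minimal Python sketch of the textbook formulation. It is not
swish-e's ranking code and makes no claim about how swish-e weights
terms. The function name tf_idf, the whitespace tokenizer, and the
unsmoothed log idf are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (textbook tf-idf).

    Hypothetical helper for illustration only; not part of swish-e.
    """
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # tf = share of the document taken up by the term,
            # idf = log(N / df), so terms found everywhere score 0.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    sample = [
        "swish indexes files",
        "swish ranks files quickly",
        "ranking is hard",
    ]
    for doc, scores in zip(sample, tf_idf(sample)):
        print(doc, "->", scores)
```

In this sketch a term that occurs in every document gets weight 0 and
rarer terms score higher. Factors such as which tag the text sits in,
which the quoted article mentions, are not modeled here.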
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Fri Jan 20 09:36:02 2006