Relevance ranking

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Aug 13 1999 - 18:11:37 GMT
Hello,

I'm looking for information on how swish-e 1.3.x computes the relevance
ranking scores.

From the docs:

"The relevance rank. 
... depends on a number of factors, such as how many times your search word
appears in the file, how many words are in the file, and if the word
appears in a title or header tag (if it's an HTML file), among other factors."

Specifically:

- what exactly is meant by "header tag" above?  Anything within the
  <HEAD></HEAD> block?  Does this mean a <TITLE> placed in a <HEAD> block
  will be ranked higher than a <TITLE> outside a <HEAD> block?

- are keywords found in <META> tags used in computing the score?  And does
  it matter where the <META> tags are located (in <HEAD>, <BODY>, or
  outside either)?

- does other HTML formatting affect the scoring?

- what are those "other factors" mentioned above?


BTW -- the indexed documents are not HTML files read by a browser, which is
why I'm describing invalid HTML.

Thanks,

Bill Moseley
mailto:moseley@hank.org
Received on Fri Aug 13 11:13:49 1999