I posted a question before that got no reply...
My understanding, based on the results I get, is that swish-e does not
discriminate between words. A word's frequency within a document is used to
compute rank, but the word's frequency across the overall document set is not
considered. I remember being taught that the weight of a word in the rank
should be inversely proportional to the number of documents it appears in.
This would cause the word 'the' to carry less weight than the word
'democracy', even if (in most document sets) 'the' appears in the title and
'democracy' only in the body.
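
To make the idea concrete, here is a rough sketch in Python of the kind of
weighting I mean (usually called inverse document frequency). This is not
swish-e code and says nothing about how swish-e ranks internally; the
function names and the three toy documents are made up for illustration, and
the log form is just one common way to apply the "inversely proportional"
idea.

```python
import math
from collections import Counter

def idf_weights(docs):
    """Map each word to log(N / df), where N is the number of documents
    and df is the number of documents that contain the word."""
    n_docs = len(docs)
    doc_freq = Counter()
    for words in docs:
        # set() so a word counts once per document, however often it repeats
        doc_freq.update(set(words))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}

def score(query_word, words, idf):
    """In-document frequency scaled by the word's collection-wide weight."""
    return words.count(query_word) * idf.get(query_word, 0.0)

# Hypothetical toy collection, already split into words.
docs = [
    "the history of the democracy".split(),
    "the cat sat on the mat".split(),
    "the weather report".split(),
]
idf = idf_weights(docs)

print(score("the", docs[0], idf))
print(score("democracy", docs[0], idf))
```

Because 'the' occurs in every document, its weight is log(3/3) = 0, so it
adds nothing to the rank no matter how often it appears, while 'democracy'
occurs in only one of the three and gets the largest weight.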
Was discriminating among terms considered for swish-e and judged to be
too much additional work, or was it left out because it's a bad idea?
Or did the issue never come up?
It seems it would give more relevant results.
dave
Received on Mon Mar 8 18:13:25 2004