Re: Tuning ranking manually with MetaNamesRank

From: koszalekopalek <koszalekopalek(at)not-real.interia.pl>
Date: Thu Jun 09 2005 - 09:10:26 GMT
Peter Karman wrote:
> if you're doing more testing, I'd suggest this:
> 
> 1. set DEBUG_RANK as Bill suggests. As of 2.4.3, that's in rank.h
> 
> 2. adjust the default bias range from 10 up to 100 (I think 10 was a random 
> number to start with). You can set it in swish.h.

Yes, it's hard-coded in the source. I cannot create an index when it is 
set to more than 10:

	err: 'MetaNamesRank' value of '100' is not an integer between -10 and 10.


I guess this is the define in swish.h:

	#define RANK_BIAS_RANGE 10 /* max/min range ( -10 -> 10, with zero being no bias ) */

Before re-compiling the source (I'll do it as soon as I have a bit more 
time) I 'spammed' meta tags.  This is easy for me, because meta tags are 
inserted automatically by the filter_content callback routine in spider.pl 
based on data read from a text file.

The text file is actually a Perl array of alternating keyword and URL 
strings. I read the keywords into a hash indexed by URL. filter_content 
checks whether the hash has a key for the current URL and, if so, 
inserts an appropriate meta tag.

To 'spam' the tags I repeated each keyword string 99 times with Perl's 
x operator. I assume this has an effect similar to setting the bias, right?

@tuningcfg = (
'foo ' x 99,  'http://localhost/a.htm',
'bar ' x 99,  'http://localhost/b.htm',
);
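
A minimal sketch of the approach described above, for illustration only. It assumes the alternating keyword/URL array shown in this message; the helper name add_tuning_meta and the meta name "tuning" are hypothetical, and wiring the helper into spider.pl's filter_content callback is left out. It also assumes the keywords contain no double quotes.

```perl
use strict;
use warnings;

# Alternating keyword, URL strings, as in the tuning file (example values).
my @tuningcfg = (
    'foo ' x 99, 'http://localhost/a.htm',
    'bar ' x 99, 'http://localhost/b.htm',
);

# Build a hash of keyword strings indexed by URL.
my %keywords_for;
my @pairs = @tuningcfg;    # work on a copy so the original array survives
while ( my ( $keywords, $url ) = splice( @pairs, 0, 2 ) ) {
    $keywords_for{$url} = $keywords;
}

# Hypothetical helper: if the URL has keywords assigned, insert a meta tag
# right after the opening <head> tag. Returns 1 if a tag was inserted, else 0.
sub add_tuning_meta {
    my ( $url, $content_ref ) = @_;
    return 0 unless exists $keywords_for{$url};

    my $tag = sprintf qq{<meta name="tuning" content="%s">\n},
        $keywords_for{$url};

    return $$content_ref =~ s/(<head\b[^>]*>)/$1\n$tag/i ? 1 : 0;
}
```

For the tag to influence results, the meta name would presumably also have to be declared in the swish-e configuration (MetaNames, and MetaNamesRank if a bias is wanted).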

I have not debugged the actual ranks but so far I have not seen any 
changes in the result order. (Maybe there is some stupid mistake in my 
code/cfg file? -- I'm double checking.)

Anyway -- I think what I am doing is becoming a hack on top of a hack. 
Let's change it into a feature request:-)

The whole point is that I think it is useful to be able to manually 
assign urls to selected keywords. (Remember that Google demo I mentioned 
in my first post?)  The keyword/url pairs could be read from a plain 
text file. The location of that file could be specified in the 
configuration hash for spider.pl. This is easy. Now, once I index a URL 
and I know that some 'keywords' are assigned to it, how do I tweak the 
ranking? I thought that automatically inserted meta tags were a good 
idea but maybe there is a better way?
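
As a rough sketch of the plain-text-file part of this request: the loader below assumes a hypothetical format of one "keyword<TAB>url" pair per line, with blank lines and lines starting with # ignored. The function name load_tuning_file and the file format are assumptions made for illustration, not part of spider.pl.

```perl
use strict;
use warnings;

# Hypothetical loader: reads "keyword<TAB>url" pairs, one per line, and
# returns a hash reference mapping each URL to an array of its keywords.
sub load_tuning_file {
    my ($path) = @_;
    my %keywords_for;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines

        my ( $keyword, $url ) = split /\t/, $line, 2;
        next unless defined $url && length $url;    # skip malformed lines

        push @{ $keywords_for{$url} }, $keyword;
    }
    close $fh;

    return \%keywords_for;
}
```

The path could then come from an entry in the spider.pl configuration hash (for example a key such as tuning_file, which is hypothetical here), and the returned hash could be consulted in a filter_content callback.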

Adam


> since you've read the FAQ, you'll know that this feature is highly experimental. 
> I would try it with both RankSchemes as well, to see if that makes a difference. 
> The bias is a bit of hack, since it just artificially increases the frequency of 
> the word(s), and frequency has more/less effect depending on which RankScheme 
> you use.

Received on Thu Jun 9 02:10:34 2005