Re: MetaNamesRank & exe build for Windows

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Sep 09 2003 - 01:20:02 GMT
On Mon, Sep 08, 2003 at 04:42:48PM -0700, intervolved none wrote:
> I am trying to get the MetaNamesRank working. I installed the windows
> 2.4... version, added the line "MetaNamesRank 10 keywords" to my
> config file, reindexed the site, and looked to see if there was any
> change in the indexing.  There was not.  It is like it ignored the
> configuration file setting.  Am I supposed to make any more
> configuration changes for it to pick up the meta tag line from the
> html document?
> 
> I am trying to get the meta tags in my html page to give more weight
> than the actual text...  example : <meta name="keywords" content="test
> help work">

Meta tags do have more weight than body text by default:

moseley@bumby:~$ cat 1.html 2.html
<html>
<head><title>Title</title>
</head>
<body>
body
testword
</body>
</html>

<html>
<head><title>Title</title>
<meta name="foo" content="testword">
</head>
<body>
body
</body>
</html>

moseley@bumby:~$ cat c
Metanames foo

moseley@bumby:~$ swish-e -c c -i 1.html 2.html -v0

moseley@bumby:~$ swish-e -w testword or foo=testword -H0
1000 2.html "Title" 107
431 1.html "Title" 79
.

Now try with two metanames:

moseley@bumby:~$ cat 3.html
<html>
<head><title>Title</title>
<meta name="bar" content="testword">
</head>
<body>
body
</body>
</html>

Both get the same rank here:

moseley@bumby:~$ swish-e -c c -i 2.html 3.html -v0
moseley@bumby:~$ swish-e -w bar=testword or foo=testword -H0
1000 3.html "Title" 107
1000 2.html "Title" 107
.

Now try changing the rank based on MetaNamesRank:

moseley@bumby:~$ cat c
Metanames foo bar
MetaNamesRank 10 foo

moseley@bumby:~$ swish-e -c c -i 2.html 3.html -v0

moseley@bumby:~$ swish-e -w bar=testword or foo=testword -H0
1000 2.html "Title" 107
592 3.html "Title" 107
.

> The meta tag line in the html should be weighted 10 times more than
> the words on the page, correct?

No, not really.  Sure is a lot easier when you can build from source.  


Here's the calculation for each word:

    for(i = 0; i < freq; i++)
        rank += sw->structure_map[ GET_STRUCTURE(posdata[i]) ] + meta_bias;

Rank is the sum over each occurrence of the word, where each occurrence
contributes the meta_bias plus its "structure" value, which is based on
where the word appears in the document.  Then the log is taken of that
number.

moseley@bumby:~$ swish-e -c c -i 2.html 3.html -T indexed_words -v0
    Adding:[1:swishdefault(1)]   'title'   Pos:2  Stuct:0x7 ( HEAD TITLE FILE )
    Adding:[1:foo(10)]   'testword'   Pos:5  Stuct:0x85 ( META HEAD FILE )
    Adding:[1:swishdefault(1)]   'body'   Pos:8  Stuct:0x9 ( BODY FILE )
    Adding:[2:swishdefault(1)]   'title'   Pos:2  Stuct:0x7 ( HEAD TITLE FILE )
    Adding:[2:bar(11)]   'testword'   Pos:5  Stuct:0x85 ( META HEAD FILE )
    Adding:[2:swishdefault(1)]   'body'   Pos:8  Stuct:0x9 ( BODY FILE )

So "testword" has a structure of 0x85, meaning it's in a file (duh), 
it's in the <head> section, and it's also in a <meta> tag.  <head> is not 
used in the rank calculation.

Then in config.h:

#define RANK_TITLE              7  // <title>
#define RANK_HEADER             5  // <h1> <h2>
#define RANK_META               3  // <meta> (or any <tag> for xml)
#define RANK_COMMENTS           1  // <!-- comment -->
#define RANK_EMPHASIZED         0  // <em> <b> <strong> (or something like that)

So we have a <meta> so that's a value of three.  But one is added to 
that so all words have at least a point value of one for the structure.  
So it should be 3 + 1 = 4.

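Again, the total rank is 1 + the total over all of the word's positions.
In this case that's 1 + 4 = 5 for "bar", which has no bias.  For "foo",
the MetaNamesRank bias of 10 is added as well: 1 + 4 + 10 = 15.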
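
A small standalone sketch of that arithmetic, not the swish-e code itself.
The constant names (ST_*), structure_value() and raw_rank() are made up
for illustration.  The bit values are read off the -T indexed_words output
and the rank values come from the config.h lines quoted above.  Adding the
values when more than one tag bit is set is an assumption; this example
only exercises a single <meta> bit.

```c
#include <assert.h>

/* Structure bits, read off the -T indexed_words output
 * (0x7 = HEAD TITLE FILE, 0x9 = BODY FILE, 0x85 = META HEAD FILE).
 * The constant names are hypothetical, chosen for this sketch. */
enum {
    ST_FILE  = 0x01,
    ST_TITLE = 0x02,
    ST_HEAD  = 0x04,   /* carries no rank value */
    ST_BODY  = 0x08,
    ST_META  = 0x80
};

/* Values from config.h as quoted in the message. */
#define RANK_TITLE 7
#define RANK_META  3

/* Structure value for one word position: 1 plus the rank of each
 * matching tag.  Summing several tags is an assumption. */
static int structure_value(unsigned structure)
{
    int value = 1;                      /* every word gets at least 1 */
    if (structure & ST_TITLE) value += RANK_TITLE;
    if (structure & ST_META)  value += RANK_META;
    return value;
}

/* Raw rank of one word in one file:
 * 1 + sum over its positions of (structure value + meta_bias). */
static int raw_rank(const unsigned *structures, int freq, int meta_bias)
{
    int rank = 1;
    assert(freq >= 0);
    for (int i = 0; i < freq; i++)
        rank += structure_value(structures[i]) + meta_bias;
    return rank;
}
```

Called with a single position of 0x85, raw_rank() gives 5 with a meta_bias
of 0 and 15 with a meta_bias of 10.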

Let's try:


moseley@bumby:~$ (cd swish-e && make clean && ./configure CFLAGS='-DRAW_RANK -DDEBUG_RANK' && make) >/dev/null
moseley@bumby:~$ swish-e/src/swish-e -w bar=testword or foo=testword -H0
File num: 2.  Raw Rank: 5.  Frequency: 1 scaled rank: 16094
  Structure tally:
      struct 0x85 =  1 ( META HEAD FILE ) x rank map of 4 = 4

File num: 1.  Raw Rank: 15.  Frequency: 1 scaled rank: 27081
  Structure tally:
      struct 0x85 =  1 ( META HEAD FILE ) x rank map of 4 = 4

270 2.html "Title" 107
160 3.html "Title" 107

The scaled rank is the displayed rank * 100 (27081 -> 270, 16094 -> 160).

Clear as a bell, no? ;)




-- 
Bill Moseley
moseley@hank.org
Received on Tue Sep 9 01:20:14 2003