On Mon, Sep 08, 2003 at 04:42:48PM -0700, intervolved none wrote:
> I am trying to get the MetaNamesRank working. I installed the windows
> 2.4... version, added the line "MetaNamesRank 10 keywords" to my
> config file, reindexed the site, and looked to see if there was any
> change in the indexing. There was not. It is like it ignored the
> configuration file setting. Am I supposed to make any more
> configuration changes for it to pick up the meta tag line from the
> html document?
>
> I am trying to get the meta tags in my html page to give more weight
> than the actual text... example : <meta name="keywords" content="test
> help work">
Meta tags do have more weight:
moseley@bumby:~$ cat 1.html 2.html
<html>
<head><title>Title</title>
</head>
<body>
body
testword
</body>
</html>
<html>
<head><title>Title</title>
<meta name="foo" content="testword">
</head>
<body>
body
</body>
</html>
moseley@bumby:~$ cat c
Metanames foo
moseley@bumby:~$ swish-e -c c -i 1.html 2.html -v0
moseley@bumby:~$ swish-e -w testword or foo=testword -H0
1000 2.html "Title" 107
431 1.html "Title" 79
.
Now try with two metanames:
moseley@bumby:~$ cat 3.html
<html>
<head><title>Title</title>
<meta name="bar" content="testword">
</head>
<body>
body
</body>
</html>
See, they have the same rank here:
moseley@bumby:~$ swish-e -c c -i 2.html 3.html -v0
moseley@bumby:~$ swish-e -w bar=testword or foo=testword -H0
1000 3.html "Title" 107
1000 2.html "Title" 107
.
Now try changing the rank based on MetaNamesRank:
moseley@bumby:~$ cat c
Metanames foo bar
MetaNamesRank 10 foo
moseley@bumby:~$ swish-e -c c -i 2.html 3.html -v0
moseley@bumby:~$ swish-e -w bar=testword or foo=testword -H0
1000 2.html "Title" 107
592 3.html "Title" 107
.
> The meta tag line in the html should be weighted 10 times more than
> the words on the page, correct?
No, not really. Sure is a lot easier when you can build from source.
Here's the calculation for each word:
    for(i = 0; i < freq; i++)
        rank += sw->structure_map[ GET_STRUCTURE(posdata[i]) ] + meta_bias;
Rank is the sum of each word's rank, where each word's rank is the
meta_bias plus its "structure" value, which is based on its position.
Then the log is taken of that number.
moseley@bumby:~$ swish-e -c c -i 2.html 3.html -T indexed_words -v0
Adding:[1:swishdefault(1)] 'title' Pos:2 Stuct:0x7 ( HEAD TITLE FILE )
Adding:[1:foo(10)] 'testword' Pos:5 Stuct:0x85 ( META HEAD FILE )
Adding:[1:swishdefault(1)] 'body' Pos:8 Stuct:0x9 ( BODY FILE )
Adding:[2:swishdefault(1)] 'title' Pos:2 Stuct:0x7 ( HEAD TITLE FILE )
Adding:[2:bar(11)] 'testword' Pos:5 Stuct:0x85 ( META HEAD FILE )
Adding:[2:swishdefault(1)] 'body' Pos:8 Stuct:0x9 ( BODY FILE )
So "testword" has a structure of 0x85, meaning it's in a file (duh),
it's in the <head> section, and it's also in a <meta> tag. <head> is not
used in the rank calculation.
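
If you want to read those Struct values yourself, here is a rough sketch
of the decoding. The bit values are only inferred from the three values
in the trace above (0x7, 0x85, 0x9); they are not copied from the swish-e
headers, and decode_structure is a made-up helper name.

```python
# Bit values inferred from the -T indexed_words trace; treat them as
# assumptions, not as the definitions in the swish-e source.
STRUCTURE_FLAGS = [
    (0x80, "META"),
    (0x08, "BODY"),
    (0x04, "HEAD"),
    (0x02, "TITLE"),
    (0x01, "FILE"),
]


def decode_structure(value):
    """Return the names of the known flags set in value, highest bit first."""
    return [name for bit, name in STRUCTURE_FLAGS if value & bit]


# Print the three structure values that appear in the trace.
for value in (0x7, 0x85, 0x9):
    print(hex(value), " ".join(decode_structure(value)))
```

Any other bits that swish-e may define are simply ignored by this sketch.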
Then in config.h:
#define RANK_TITLE 7 // <title>
#define RANK_HEADER 5 // <h1> <h2>
#define RANK_META 3 // <meta> (or any <tag> for xml)
#define RANK_COMMENTS 1 // <!-- comment -->
#define RANK_EMPHASIZED 0 // <em> <b> <strong> (or something like that)
So we have a <meta> so that's a value of three. But one is added to
that so all words have at least a point value of one for the structure.
So it should be 3 + 1 = 4.
Again, the total rank is 1 + the total of the values for all the word
positions. In this case that's 1 + 4 = 5.
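
The same arithmetic as a small sketch, in case it helps. This is only a
model of the description above, not the swish-e code: raw_rank is a
made-up name, and the starting value of 1 and the "+ 1" on the structure
value are taken from the explanation, not from the source.

```python
RANK_META = 3  # from the config.h excerpt


def raw_rank(structure_values, meta_bias=0):
    """Return 1 + the sum over word positions of (structure value + meta_bias)."""
    rank = 1
    for value in structure_values:
        rank += value + meta_bias
    return rank


# Every word gets at least one point for its structure, so <meta> is 3 + 1.
meta_value = RANK_META + 1

print(raw_rank([meta_value]))                # one hit in a plain metaname: 5
print(raw_rank([meta_value], meta_bias=10))  # same hit with MetaNamesRank 10: 15
```

With "MetaNamesRank 10 foo" the bias of 10 is added once per word
position, so a single hit goes from 5 to 15 rather than to ten times 5.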
Let's try:
moseley@bumby:~$ (cd swish-e && make clean && ./configure CFLAGS='-DRAW_RANK -DDEBUG_RANK' && make) >/dev/null
moseley@bumby:~$ swish-e/src/swish-e -w bar=testword or foo=testword -H0
File num: 2. Raw Rank: 5. Frequency: 1 scaled rank: 16094
Structure tally:
struct 0x85 = 1 ( META HEAD FILE ) x rank map of 4 = 4
File num: 1. Raw Rank: 15. Frequency: 1 scaled rank: 27081
Structure tally:
struct 0x85 = 1 ( META HEAD FILE ) x rank map of 4 = 4
270 2.html "Title" 107
160 3.html "Title" 107
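The scaled rank is the natural log of the raw rank times 10000
(ln 5 = 1.6094), which is 100 times the rank shown in the results.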
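
To check those numbers by hand, here is a sketch that reproduces them
from the raw ranks. The factor of 10000, the rounding, and the integer
division by 100 are assumptions chosen to match the debug output above;
scaled_rank and displayed_rank are made-up names, not swish-e functions.

```python
import math


def scaled_rank(raw):
    """Natural log of the raw rank, times 10000, rounded to an integer."""
    return round(math.log(raw) * 10000)


def displayed_rank(raw):
    """The scaled rank with its last two digits dropped."""
    return scaled_rank(raw) // 100


for raw in (5, 15):
    print(raw, scaled_rank(raw), displayed_rank(raw))
# 5 16094 160
# 15 27081 270
```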
Clear as a bell, no? ;)
--
Bill Moseley
moseley@hank.org
Received on Tue Sep 9 01:20:14 2003