Thanks for the information; it made testing very simple. I took your test files and ran the same tests. I am running on Windows 2000.
I did not get the same results:
D:\joe\SWISH-E>
D:\joe\SWISH-E>type 1.html
<html>
<head><title>Title</title>
</head>
<body>
body
testword
</body>
</html>
D:\joe\SWISH-E>type 2.html
<html>
<head><title>Title</title>
<meta name="foo" content="testword">
</head>
<body>
body
</body>
</html>
D:\joe\SWISH-E>type c
Metanames foo
D:\joe\SWISH-E>swish-e -c c -i 1.html 2.html -v0
D:\joe\SWISH-E>swish-e -w testword -H0
1000 1.html "Title" 87 <---- not the same.....
D:\joe\SWISH-E>type 3.html
<html>
<head><title>Title</title>
<meta name="bar" content="testword">
</head>
<body>
body
</body>
</html>
D:\joe\SWISH-E>swish-e -c c -i 2.html 3.html -v0
D:\joe\SWISH-E>swish-e -w bar=testword -H0
err: Unknown metaname: 'bar' <-- ok did not have bar in the config file "c"...
.
D:\joe\SWISH-E>swish-e -w foo=testword -H0
1000 2.html "Title" 115
D:\joe\SWISH-E>type c
Metanames foo bar
MetaNamesRank 10 foo
D:\joe\SWISH-E>swish-e -w bar=testword -H0
err: Unknown metaname: 'bar' <--- added bar but still did not find it....
.
D:\joe\SWISH-E>swish-e -w foo=testword -H0
1000 2.html "Title" 115 <--- ok did not include 3.html....
D:\joe\SWISH-E>swish-e -h
usage:
swish [-e] [-i dir file ... ] [-S system] [-c file] [-f file] [-l] [-v (num)]
swish -w word1 word2 ... [-f file1 file2 ...] \
[-P phrase_delimiter] [-p prop1 ...] [-s sortprop1 [asc|desc] ...] \
[-m num] [-t str] [-d delim] [-H (num)] [-x output_format]
swish -k (char|*) [-f file1 file2 ...]
swish -M index1 index2 ... outputfile
swish -N /path/to/compare/file
swish -V
options: defaults are in brackets
-S : specify which indexing system to use.
Valid options are:
"fs" - index local files in your File System
"http" - index web site files using a web crawler
"prog" - index files supplied by an external program
The default value is: "fs"
-i : create an index from the specified files
-w : search for words "word1 word2 ..."
-t : tags to search in - specify as a string
"HBthec" - in Head|Body|title|header|emphasized|comments
-f : index file to create or file(s) to search from [index.swish-e]
-c : configuration file(s) to use for indexing
-v : indexing verbosity level (0 to 3) [-v 1]
-T : Trace options ('-T help' for info
-l : follow symbolic links when indexing
-b : begin results at this number
-m : the maximum number of results to return [defaults to all results]
-M : merges index files
-N : index only files with a modification date newer than path supplied
-p : include these document properties in the output "prop1 prop2 ..."
-s : sort by these document properties in the output "prop1 prop2 ..."
-d : next param is delimiter.
-P : next param is Phrase delimiter.
-V : prints the current version
-e : "Economic Mode": The index proccess uses less RAM.
-x : "Extended Output Format": Specify the output format.
-H : "Result Header Output": verbosity (0 to 9) [1].
-k : Print words starting with a given char.
-E : Append errors to file specified, or stderr if file not specified.
version: 2.4.0-pr1 <-- version....
docs: http://swish-e.org
Scripts and Modules at: (libexecdir) = D:\joe\SWISH-E\lib\swish-e
D:\joe\SWISH-E>
Bill Moseley <moseley@hank.org> wrote:
On Mon, Sep 08, 2003 at 04:42:48PM -0700, intervolved none wrote:
> I am trying to get the MetaNamesRank working. I installed the Windows
> 2.4... version, added the line "MetaNamesRank 10 keywords" to my
> config file, reindexed the site, and looked to see if there was any
> change in the indexing. There was not. It is like it ignored the
> configuration file setting. Am I supposed to make any more
> configuration changes for it to pick up the meta tag line from the
> html document?
>
> I am trying to get the meta tags in my html page to give more weight
> than the actual text... example : > help work" name=keywords>
meta tags do have more weight:
moseley@bumby:~$ cat 1.html 2.html
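<html>
<head><title>Title</title>
</head>
<body>
body
testword
</body>
</html>
<html>
<head><title>Title</title>
<meta name="foo" content="testword">
</head>
<body>
body
</body>
</html>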
moseley@bumby:~$ cat c
Metanames foo
moseley@bumby:~$ swish-e -c c -i 1.html 2.html -v0
moseley@bumby:~$ swish-e -w testword or foo=testword -H0
1000 2.html "Title" 107
431 1.html "Title" 79
.
Now try with two metanames:
moseley@bumby:~$ cat 3.html
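<html>
<head><title>Title</title>
<meta name="bar" content="testword">
</head>
<body>
body
</body>
</html>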
See they have the same value here:
moseley@bumby:~$ swish-e -c c -i 2.html 3.html -v0
moseley@bumby:~$ swish-e -w bar=testword or foo=testword -H0
1000 3.html "Title" 107
1000 2.html "Title" 107
.
Now try changing the rank based on MetaNamesRank:
moseley@bumby:~$ cat c
Metanames foo bar
MetaNamesRank 10 foo
moseley@bumby:~$ swish-e -c c -i 2.html 3.html -v0
moseley@bumby:~$ swish-e -w bar=testword or foo=testword -H0
1000 2.html "Title" 107
592 3.html "Title" 107
.
> The meta tag line in the html should be weighted 10 times more than
> the words on the page, correct?
No, not really. Sure is a lot easier when you can build from source.
Here's the calculation for each word:
    for(i = 0; i < freq; i++)
        rank += sw->structure_map[ GET_STRUCTURE(posdata[i]) ] + meta_bias;
Rank is the sum of each word's rank, where each word's rank is the
meta_bias plus its "structure" value, which is based on its position.
Then the log is taken of that number.
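To make that loop concrete, here is a minimal, self-contained sketch of the summation step in C. It is not the swish-e source: the name raw_rank, the plain 256-entry weight table, and the assumption that the structure flags sit in the low byte of each position entry are all illustrative. It covers only the sum; the log step and any later scaling are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the structure flags are the low 8 bits of a position entry. */
#define SKETCH_GET_STRUCTURE(pos) ((unsigned)(pos) & 0xFFu)

/*
 * Sum the per-occurrence weights for one word in one document.
 *   structure_map  256 weights indexed by structure byte (hypothetical values)
 *   posdata, freq  one entry per occurrence of the word
 *   meta_bias      per-metaname bias added to every occurrence
 */
int raw_rank(const int structure_map[256],
             const unsigned *posdata, size_t freq,
             int meta_bias)
{
    int rank = 0;
    size_t i;

    /* An empty occurrence list is fine; a NULL list with freq > 0 is not. */
    assert(posdata != NULL || freq == 0);

    for (i = 0; i < freq; i++)
        rank += structure_map[SKETCH_GET_STRUCTURE(posdata[i])] + meta_bias;

    return rank;
}
```

With a made-up weight of 3 for structure 0x85 and a meta_bias of 10, two occurrences of a word give a raw rank of (3 + 10) * 2 = 26. The point is that the bias is added per occurrence before the log is taken, so it does not act as a simple multiplier on the final score.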
moseley@bumby:~$ swish-e -c c -i 2.html 3.html -T indexed_words -v0
Adding:[1:swishdefault(1)] 'title' Pos:2 Stuct:0x7 ( HEAD TITLE FILE )
Adding:[1:foo(10)] 'testword' Pos:5 Stuct:0x85 ( META HEAD FILE )
Adding:[1:swishdefault(1)] 'body' Pos:8 Stuct:0x9 ( BODY FILE )
Adding:[2:swishdefault(1)] 'title' Pos:2 Stuct:0x7 ( HEAD TITLE FILE )
Adding:[2:bar(11)] 'testword' Pos:5 Stuct:0x85 ( META HEAD FILE )
Adding:[2:swishdefault(1)] 'body' Pos:8 Stuct:0x9 ( BODY FILE )
So "testword" has a structure of 0x85, meaning it's in a file (duh) and
it's in the <head> section and it's also in a <meta> tag. is not
used.
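The structure values in the trace can be read as bit flags. The sketch below decodes them using bit values inferred only from the three distinct values in the trace output above (0x7 = HEAD TITLE FILE, 0x85 = META HEAD FILE, 0x9 = BODY FILE). The constant and function names are made up for illustration, and there may well be other flag bits that this trace does not show.

```c
#include <string.h>

/* Bit values inferred from the trace output, not copied from the swish-e headers. */
enum {
    SK_FILE  = 0x01,
    SK_TITLE = 0x02,
    SK_HEAD  = 0x04,
    SK_BODY  = 0x08,
    SK_META  = 0x80
};

/* Write the flag names set in `structure` into buf, in the order the trace prints them. */
char *describe_structure(unsigned structure, char *buf, size_t buflen)
{
    static const struct { unsigned bit; const char *name; } flags[] = {
        { SK_META, "META" }, { SK_HEAD, "HEAD" }, { SK_TITLE, "TITLE" },
        { SK_BODY, "BODY" }, { SK_FILE, "FILE" }
    };
    size_t i;

    if (buflen == 0)
        return buf;
    buf[0] = '\0';

    for (i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (structure & flags[i].bit) {
            /* Separate names with a single space, never overrunning buf. */
            if (buf[0] != '\0')
                strncat(buf, " ", buflen - strlen(buf) - 1);
            strncat(buf, flags[i].name, buflen - strlen(buf) - 1);
        }
    }
    return buf;
}
```

Calling describe_structure(0x85, buf, sizeof buf) with a 64-byte buffer fills it with "META HEAD FILE", matching the trace line for "testword".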
Then in config.h:
#define RANK_TITLE 7 //
Received on Tue Sep 9 23:31:14 2003