On Fri, Dec 28, 2007 at 10:05:43PM -0600, Eric Jobidon wrote:
> index[resolved]
>
> Is it appropriate to interpret the "position data" as a page number? So
> "(5/9)" would indicate that the word occurs (at least once) on page 5 of a
> nine page document?
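No. In each pair the first number is the word's position in the document
and the second is the "structure", which indicates where in an (HTML) file
the word was found. So "5/9" means word position 5 with structure 0x9
(BODY FILE in the output below).
The word position isn't much use for locating the word in the source
document.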
Swish uses it for phrase matching.
$ cat 1
hello hello there
$ swish-e -T indexed_words -v0 -i 1
Adding:[1:swishdefault(1)] 'hello' Pos:5 Stuct:0x9 ( BODY FILE )
Adding:[1:swishdefault(1)] 'hello' Pos:6 Stuct:0x9 ( BODY FILE )
Adding:[1:swishdefault(1)] 'there' Pos:7 Stuct:0x9 ( BODY FILE )
$ swish-e -T index_words
-----> WORD INFO in index index.swish-e <-----
hello [1 1 2 (5/9 6/9)]
there [1 1 1 (7/9)]
$ swish-e -T index_words_full
-----> WORD INFO in index index.swish-e <-----
hello
Meta:1 1 Freq:2 Pos/Struct:5/9,6/9
there
Meta:1 1 Freq:1 Pos/Struct:7/9
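
To illustrate why positions are enough for phrase matching, here is a rough
sketch in Python. It is not how Swish-e implements it internally; it only
parses the two "index_words" lines shown above and checks for consecutive
positions. The names parse_positions and phrase_starts are made up for this
example, and the regular expression assumes the line layout seen in that
output.

```python
import re

# Two lines copied from the `swish-e -T index_words` output above.
SAMPLE = """\
hello [1 1 2 (5/9 6/9)]
there [1 1 1 (7/9)]
"""

# Assumed layout: word, then "[... (pos/struct pos/struct ...)]".
ENTRY = re.compile(r"^(\S+) \[.*\((.*)\)\]$")


def parse_positions(text):
    """Map each word to the sorted list of its word positions."""
    positions = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue
        word, pairs = match.groups()
        # Each pair is "position/structure"; only the position is kept here.
        positions[word] = sorted(int(p.split("/")[0]) for p in pairs.split())
    return positions


def phrase_starts(positions, phrase):
    """Return the positions at which the words of `phrase` occur in sequence."""
    words = phrase.split()
    if not words:
        return []
    starts = []
    for start in positions.get(words[0], []):
        # Word i of the phrase must sit exactly i positions after the first.
        if all(start + i in positions.get(w, []) for i, w in enumerate(words)):
            starts.append(start)
    return starts


if __name__ == "__main__":
    index = parse_positions(SAMPLE)
    print(phrase_starts(index, "hello there"))
    print(phrase_starts(index, "there hello"))
```

With the sample above, "hello there" matches starting at position 6 (6
followed by 7), while "there hello" does not match, because no "hello" sits
at position 8.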
--
Bill Moseley
moseley@hank.org
Received on Sat Dec 29 00:14:35 2007