Hi Bas
On 19 Sep 2000, at 5:23, Bas Meijer wrote:
> Hi,
>
>
> Swish-e 1.3.x has a -D flag for decompressing indexfiles to stdout.
> A lot of numbers scroll past, along with lines in something like this format:
> word: num num num num ...
>
> Is this flag still supported in 2.0.1? (I still need to upgrade
> lookup to that version).
>
> Does anyone know what these numbers mean? I hope they can be useful
> for an idea I have for post-processing the index. At this point I have
> only made dictionary.cgi, which allows you to browse the words in the
> index alphabetically and search for them with lookup.cgi.
>
Yes, it is still supported. Now, it gives more info:
Try swish-e -v 4 -D test.index
Ignore the OFSETS INFO and HASHOFFSETS INFO sections. I use them for debugging.
You will see something like this for the words (WORD INFO part):
myword: Meta:1 ./test_meta.html Rank:5800 Strct:7 Freq:2 Pos:3 15
This means that the word "myword" is in MetaName 1 (i.e. no
MetaName), in the file ./test_meta.html, with a rank of 5800 and a
structure of 7 (as in 1.3.x). The frequency is 2, and the positions of
the word in the file are 3 and 15.
The same info without -v 4 will look like:
myword: 1 8 5800 7 2 3 15
8 is the filenumber.
BTW, I use the positions for implementing phrase search.
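To illustrate why stored positions are enough for phrase search, here is a rough Python sketch. It is not the swish-e implementation; the function name and the dictionary layout are made up for this example. It assumes you already have, for one file, the list of positions of each word, and it treats a phrase as matching when its words sit at consecutive positions.

```python
def phrase_in_file(positions_by_word, phrase):
    """Return True if the words of `phrase` occur at consecutive positions.

    positions_by_word: hypothetical dict mapping a word to the list of
    positions where it occurs in ONE file (e.g. {"myword": [3, 15]}).
    phrase: list of words in the order they must appear.
    """
    if not phrase:
        return False
    first, rest = phrase[0], phrase[1:]
    # Position sets for the second, third, ... word of the phrase.
    later = [set(positions_by_word.get(word, ())) for word in rest]
    # Try every occurrence of the first word as the start of the phrase.
    for start in positions_by_word.get(first, ()):
        if all(start + offset + 1 in later[offset] for offset in range(len(later))):
            return True
    return False


positions = {"swish": [3, 15], "index": [4, 20]}
print(phrase_in_file(positions, ["swish", "index"]))  # positions 3 and 4 are adjacent
print(phrase_in_file(positions, ["index", "swish"]))  # no adjacent pair in this order
```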
Of course, a word can have several sets of this info. In that
case, the sets are sorted by metaname, then filenumber.
All the words are sorted.
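For post-processing, a line of the compact (non -v 4) output could be split up roughly as in the Python sketch below. The field order follows the example above (metaname, filenumber, rank, structure, frequency, then the positions). The sketch assumes that when a word has several sets they simply follow each other on the same line, and that the frequency says how many positions belong to each set; I have not checked that against the actual dump, so treat it as a starting point. The names Posting and parse_word_line are made up for the example.

```python
from typing import List, NamedTuple, Tuple


class Posting(NamedTuple):
    """One set of info for a word (hypothetical name, not a swish-e term)."""
    metaname: int
    filenum: int
    rank: int
    structure: int
    freq: int
    positions: List[int]


def parse_word_line(line: str) -> Tuple[str, List[Posting]]:
    """Parse a compact dump line such as 'myword: 1 8 5800 7 2 3 15'."""
    # Split on the LAST colon, so a word containing ':' is kept whole.
    word, _, rest = line.rpartition(":")
    nums = [int(tok) for tok in rest.split()]
    postings = []
    i = 0
    while i < len(nums):
        if len(nums) - i < 5:
            raise ValueError("truncated set in line: %r" % line)
        meta, filenum, rank, structure, freq = nums[i:i + 5]
        # ASSUMPTION: freq gives the number of positions that follow.
        positions = nums[i + 5:i + 5 + freq]
        if len(positions) != freq:
            raise ValueError("missing positions in line: %r" % line)
        postings.append(Posting(meta, filenum, rank, structure, freq, positions))
        i += 5 + freq
    return word.strip(), postings


word, postings = parse_word_line("myword: 1 8 5800 7 2 3 15")
print(word, postings)
```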
If you need more info let me know.
cu
Jose
Received on Tue Sep 19 15:31:28 2000