I'm looking for help from someone familiar with the format of the
swish-e index.
I'd like to determine the frequency of words in a document set, based on
the swish-e index.
I know that:
swish-e -T INDEX_WORDS
will give me a dump of all the words in the index. How do I read that
output?
Here's one example:
'text' [1 801 1 (1737/49)]
I assume 'text' is the word. How do I interpret the part in brackets? I
assume those numbers encode the word position, which document(s)
(swishfilenum?) the word appears in, and so on.
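
For anyone who wants to experiment while the field meanings are still unknown, here is a minimal Python sketch that only splits a dump line into its raw pieces. It assumes every line looks like the single example above (a quoted word followed by one bracketed group); the function name parse_dump_line is made up for illustration, and the code deliberately does not assign any meaning to the numbers.

```python
import re

# Assumed line shape, taken from the one example in this post:
#   'word' [n n n (n/n)]
LINE_RE = re.compile(r"^\s*'(?P<word>.*)'\s*\[(?P<body>.*)\]\s*$")
PAREN_RE = re.compile(r"\(([^)]*)\)")


def parse_dump_line(line):
    """Split one dump line into (word, plain_numbers, paren_groups).

    Returns None if the line does not match the assumed shape.
    The numbers are returned uninterpreted.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    body = m.group("body")
    # Numbers inside parentheses, kept together per group, e.g. (1737, 49)
    paren_groups = [
        tuple(int(n) for n in re.findall(r"\d+", group))
        for group in PAREN_RE.findall(body)
    ]
    # Remaining numbers outside any parentheses, in order of appearance
    outside = PAREN_RE.sub(" ", body)
    plain_numbers = [int(n) for n in re.findall(r"\d+", outside)]
    return m.group("word"), plain_numbers, paren_groups


if __name__ == "__main__":
    print(parse_dump_line("'text' [1 801 1 (1737/49)]"))
```

Once the meaning of each field is known, a frequency count could be built on top of the tuples this returns.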
Thanks.
--
Peter Karman - Software Publications Engineer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Mon Mar 29 07:43:10 2004