INDEX_WORDS interpretation

From: Peter Karman <karman(at)not-real.cray.com>
Date: Mon Mar 29 2004 - 15:43:09 GMT
I'm looking for help from someone familiar with the format of the swish-e
index.

I'd like to determine the frequency of words in a document set, based on 
the swish index.

I know that:

swish-e -T INDEX_WORDS

will give me a dump of all the words in the index. How do I read that 
output?

Here's one example:

'text' [1 801 1 (1737/49)]

I assume 'text' is the word. How do I interpret the numbers inside the
brackets? I assume they account for word position, which document(s)
(swishfilenum ??) the word appears in, etc.
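
For anyone who wants to experiment while the field meanings are still unknown, here is a minimal Python sketch that only splits a dump line into the word and the raw integers that follow it. The line pattern is an assumption based on the single example above, the function name parse_index_words_line is made up for illustration, and the sketch deliberately assigns no meaning to any of the numbers.

```python
import re

# Pattern assumed from the one example line in this message:
#   'text' [1 801 1 (1737/49)]
# The real dump format may have variations this does not cover.
ENTRY_RE = re.compile(r"^'(?P<word>.*)'\s*\[(?P<body>.*)\]\s*$")


def parse_index_words_line(line):
    """Return (word, raw_numbers) for one dump line, or None if it does not match.

    raw_numbers is every integer found inside the brackets, in order.
    No interpretation (position, file number, etc.) is attempted.
    """
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    numbers = [int(n) for n in re.findall(r"\d+", match.group("body"))]
    return match.group("word"), numbers


if __name__ == "__main__":
    print(parse_index_words_line("'text' [1 801 1 (1737/49)]"))
```

Once the meaning of each field is known, a frequency count could be built on top of the returned list; until then this only gets the numbers into a form that is easy to inspect.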

Thanks.
-- 
Peter Karman - Software Publications Engineer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Mon Mar 29 07:43:10 2004