Re: indexing and windows - my problem

From: Gaye Karagulle <gkaragulle(at)not-real.yahoo.com>
Date: Mon Feb 25 2002 - 15:10:33 GMT
Yes. In fact, I don't want to use the indexing feature
of swish, because I have to do the indexing on my own;
this is my thesis.

I just want any features, if they exist, that would be
helpful in creating document vectors to compare with the
query vectors. For example, stemming seems like a very
complex topic to me, and I think swish has a feature
like this. That's why I am asking for your help.

Thanks.

  

--- Bill Moseley <moseley@hank.org> wrote:
> At 06:41 AM 2/25/2002 -0800, Gaye Karagulle wrote:
> 
> 
> >I am going to develop a library program in Visual
> >Basic that does indexing using the "vector space model",
> >and I need to find the words and their corresponding
> >frequencies for each document in my database, in order
> >to create vectors for each document. Stemming
> >should be done at the same time; namely, "run", "runs",
> >"running", etc. should be counted as the same word. The
> >word frequencies will be used as weights in the
> >document vectors.
> >
> >Can I create these document vectors using swish-e? If
> >yes, how?
> 
> Not sure I'm following what you want.  Doesn't sound like you need a search
> engine.
> 
> Do you need to find the documents or are you just interested in word
> frequency?
> 
> If just frequency then I'd probably just parse, stem, and tally up the
> counts.  Not sure why you would need swish.
> 
> With swish you can use some of the -T options to dump the index which will
> probably give you word counts, I suppose.  -T index_words_full will tell
> you the frequency of each word, but it's a lot of output to parse.
> 
> 
> 
> Bill Moseley
> mailto:moseley@hank.org
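
For illustration, the "parse, stem, and tally" approach suggested above,
plus the comparison of a document vector with a query vector, might look
roughly like the Python sketch below. This is only a sketch under stated
assumptions: the names toy_stem, term_vector and cosine are made up for
this example, and toy_stem is a crude suffix stripper standing in for a
real stemmer. It is not the stemmer swish-e uses and it is not Porter's
algorithm. Raw word counts are used as the weights, as described in the
original question, and cosine similarity is one common way to compare
vectors in the vector space model.

```python
import math
import re
from collections import Counter


def toy_stem(word):
    """Crude suffix stripper for illustration only (assumption: not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling after "ing"/"ed": "runn" -> "run".
            if (
                suffix != "s"
                and len(stem) >= 4
                and stem[-1] == stem[-2]
                and stem[-1] not in "aeiouls"
            ):
                stem = stem[:-1]
            return stem
    return word


def term_vector(text):
    """Parse text into lowercase words, stem each one, and tally the counts."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(toy_stem(w) for w in words)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc = term_vector("He runs. She is running. They run.")
    query = term_vector("running")
    print(doc)                 # "run", "runs" and "running" share one entry
    print(cosine(doc, query))  # similarity between document and query
```

The same three steps carry over to Visual Basic: split each document into
words, map each word to its stem, and keep a per-document table of counts.
A real project would likely swap toy_stem for a proper stemming algorithm.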


Received on Mon Feb 25 15:11:11 2002