Yes. In fact, I don't want to use the indexing feature
of swish, because I should do that on my own; this is
my thesis.
I just want some features, if they exist, that will be
helpful in creating document vectors to compare to the
query vectors. For example, stemming seems like a very
complex topic to me, and I think swish has a feature
like this. That's why I want help from you.
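
To make the goal concrete, here is a rough sketch in Python of the
"parse, stem, and tally" idea, followed by a cosine comparison between
a document vector and a query vector. Everything in it is an assumption
made for illustration: the names crude_stem, term_vector and
cosine_similarity are hypothetical, and the suffix stripping is a
deliberately crude stand-in, not swish's stemmer and not a full Porter
stemmer.

```python
import math
import re
from collections import Counter


def crude_stem(word):
    """Strip a few English suffixes. Illustrative only, not a real stemmer."""
    if word.endswith("ing") and len(word) >= 6:
        word = word[:-3]
        # Collapse a doubled final consonant: "runn" -> "run".
        if word[-1] == word[-2] and word[-1] not in "aeiou":
            word = word[:-1]
    elif word.endswith("s") and not word.endswith("ss") and len(word) >= 4:
        word = word[:-1]
    return word


def term_vector(text):
    """Map each stemmed word in the text to its raw frequency."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(token) for token in tokens)


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * vec_b[term] for term, count in vec_a.items() if term in vec_b)
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    document = term_vector("He runs daily. Running is fun, and a long run is better.")
    query = term_vector("running")
    print(document["run"])  # "runs", "running" and "run" share one count
    print(cosine_similarity(document, query))
```

Raw counts are used as the weights here, as described in my first
message; a real system would more likely use an established stemming
algorithm and could swap in another weighting scheme without changing
the comparison step.
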
Thanks.
--- Bill Moseley <moseley@hank.org> wrote:
> At 06:41 AM 2/25/2002 -0800, Gaye Karagulle wrote:
>
>
> >I am going to develop a library program in Visual
> >Basic that does indexing using the "vector space
> >model", and I need to find the words and their
> >corresponding frequencies for each document in my
> >database, in order to create vectors for each
> >document. Stemming should be done at the same time,
> >so that "run", "runs", "running", etc. are counted
> >as the same word. The word frequencies will be used
> >as weights in the document vectors.
> >
> >Can I create these document vectors using swish-e?
> >If yes, how?
>
> Not sure I'm following what you want. Doesn't sound
> like you need a search engine.
>
> Do you need to find the documents or are you just
> interested in word frequency?
>
> If just frequency then I'd probably just parse, stem,
> and tally up the counts. Not sure why you would need
> swish.
>
> With swish you can use some of the -T options to dump
> the index which will probably give you word counts, I
> suppose. -T index_words_full will tell you the
> frequency of each word, but it's a lot of output to
> parse.
>
>
>
> Bill Moseley
> mailto:moseley@hank.org
Received on Mon Feb 25 15:11:11 2002