On Wed, May 07, 2003 at 05:18:58PM -0700, John Movius wrote:
> Does anyone have any stats on the relative size of a regular SWISH-e
> index vs. a fuzzy SWISH-e index? I realize this could vary
> considerably.
Here's another sample: an index of about 10,000 entries, built with and without stemming.
    8170019 May 7 23:06 index.swish-e
    1519304 May 7 23:06 index.swish-e.prop
    8643319 May 7 23:09 index_no_stem.swish-e
    1519304 May 7 23:09 index_no_stem.swish-e.prop
As you can see, there's not much difference: the stemmed index is only about 5% smaller.
One bummer is that the .prop file is duplicated for each index.
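To put a number on that, here's a quick Python sketch of the arithmetic. The byte counts
are copied from the listing above; the variable names are just mine.

```python
# Sizes in bytes, copied from the directory listing.
stemmed = 8170019      # index.swish-e
unstemmed = 8643319    # index_no_stem.swish-e
prop = 1519304         # .prop file, identical size for both indexes

saved = unstemmed - stemmed
pct_index = 100.0 * saved / unstemmed            # savings relative to the index file alone
pct_total = 100.0 * saved / (unstemmed + prop)   # savings once the .prop file is counted

print(f"stemming saves {saved} bytes ({pct_index:.1f}% of the index file)")
print(f"that is {pct_total:.1f}% of index + .prop together")
# prints:
# stemming saves 473300 bytes (5.5% of the index file)
# that is 4.7% of index + .prop together
```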
It would not be too much of a hack to get swish to create a single index that includes both
the stemmed and the non-stemmed words. It could just use metanames to store the different
versions of the same word internally.
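
Roughly what I have in mind, as a toy Python sketch. This is not how swish-e works
internally. The "exact" and "stem" metanames, toy_stem() and the tiny document set are all
made up for illustration, and the suffix stripper is a crude stand-in for a real stemmer.

```python
from collections import defaultdict

def toy_stem(word):
    # Crude suffix stripper, for illustration only (not swish-e's stemmer).
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_index(docs):
    # index[metaname][term] -> set of document ids.
    # "exact" and "stem" are hypothetical metanames.
    index = {"exact": defaultdict(set), "stem": defaultdict(set)}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index["exact"][word].add(doc_id)            # word as written
            index["stem"][toy_stem(word)].add(doc_id)   # stemmed form
    return index

def search(index, word, fuzzy=False):
    # Choose the metaname at query time instead of choosing an index file.
    word = word.lower()
    if fuzzy:
        return sorted(index["stem"].get(toy_stem(word), set()))
    return sorted(index["exact"].get(word, set()))

docs = {1: "indexing large sites", 2: "the index is small", 3: "indexed yesterday"}
index = build_index(docs)

print(search(index, "index"))              # [2]
print(search(index, "index", fuzzy=True))  # [1, 2, 3]
```

In this toy version every word gets stored twice, once per metaname, so the word data grows,
but the per-document data would only need to be kept once.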
--
Bill Moseley
moseley@hank.org
Received on Thu May 8 06:44:36 2003