Hello all.
According to the documentation, the return values of SwishFuzzyWordError() are
defined in src/stemmer.h, and that is true.
However, this makes the values impossible to use in practice, because
stemmer.h is not a public header and is used only internally.
It's also not really clear whether one should use this function at all, or whether it is discouraged, deprecated, etc.
The documentation of SwishFuzzyWordError() sheds almost no light on this:
"Not all stemmers set this value correctly." - well, this means at least some of them
DO return correct values. That's better than nothing.
Maybe it's time to fix those returning incorrect values?
"But since SwishFuzzyWordList() will return a valid string regardless of the return value,
you can often just ignore this setting. That's what I do." - how often should I ignore it? =)
I mean, if the value of this function should be ignored, then the function itself is useless.
Hence the question:
Would you accept a patch that exports those constants in the public header (and changes
the function prototype accordingly), or should I forget about SwishFuzzyWordError()?
See the attached diff against current CVS.
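To illustrate what exporting the constants would allow, here is a minimal sketch of client code. The enum is copied from the attached patch only so that the sketch compiles on its own; real code would get it from swish-e.h. stem_status_name() is a hypothetical helper and not part of swish-e. Note the default branch: SwishFuzzyWordError() returns -1 for a NULL handle, which is not one of the enum values, so callers would still have to handle it.

```c
#include <stdio.h>

/* Copied from the attached patch so this sketch is self-contained.
 * With the patch applied, client code would include swish-e.h instead. */
typedef enum {
    STEM_OK,
    STEM_NOT_ALPHA,     /* not all alpha */
    STEM_TOO_SMALL,     /* word too small to be stemmed */
    STEM_WORD_TOO_BIG,  /* word too large to stem */
    STEM_TO_NOTHING     /* word stemmed to the null string */
} STEM_RETURNS;

/* Hypothetical helper (not part of swish-e): maps a status value, such as
 * the one returned by SwishFuzzyWordError(), to a readable name.
 * The parameter is an int so that -1 (NULL handle) can be passed as well. */
static const char *stem_status_name(int status)
{
    switch (status) {
    case STEM_OK:           return "STEM_OK";
    case STEM_NOT_ALPHA:    return "STEM_NOT_ALPHA";
    case STEM_TOO_SMALL:    return "STEM_TOO_SMALL";
    case STEM_WORD_TOO_BIG: return "STEM_WORD_TOO_BIG";
    case STEM_TO_NOTHING:   return "STEM_TO_NOTHING";
    default:                return "unknown (NULL handle?)";
    }
}

/* Hypothetical helper: prints a status the way libtest.c prints the raw number. */
static void print_stem_status(int status)
{
    printf(" Status: %d (%s)\n", status, stem_status_name(status));
}
```

With the patch applied, a caller could then write something like print_stem_status((int)SwishFuzzyWordError(fw)) instead of printing a bare number that can only be decoded by reading stemmer.h.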
Thanks in advance.
--
Wbr,
Antony Dovgal
Attachment: stemmer_constants.diff.txt
Index: src/libtest.c
===================================================================
RCS file: /cvsroot/swishe/swish-e/src/libtest.c,v
retrieving revision 1.16
diff -u -p -d -r1.16 libtest.c
--- src/libtest.c 12 May 2005 15:41:05 -0000 1.16
+++ src/libtest.c 28 Jan 2007 22:24:53 -0000
@@ -523,7 +523,7 @@ static void stem_it( SW_RESULT r, char *
printf(" [%s] : ", word );
fw = SwishFuzzyWord( r, word );
- printf(" Status: %d", SwishFuzzyWordError(fw) );
+ printf(" Status: %d", (int)SwishFuzzyWordError(fw) );
printf(" Word Count: %d\n", SwishFuzzyWordCount(fw) );
printf(" words:");
Index: src/stemmer.c
===================================================================
RCS file: /cvsroot/swishe/swish-e/src/stemmer.c,v
retrieving revision 1.30
diff -u -p -d -r1.30 stemmer.c
--- src/stemmer.c 12 Nov 2006 02:52:39 -0000 1.30
+++ src/stemmer.c 28 Jan 2007 22:24:54 -0000
@@ -495,12 +495,12 @@ int SwishFuzzyWordCount( FUZZY_WORD *fw
/* Returns the integer value of the error */
-int SwishFuzzyWordError( FUZZY_WORD *fw )
+STEM_RETURNS SwishFuzzyWordError( FUZZY_WORD *fw )
{
if ( !fw )
return -1;
- return (int)fw->error;
+ return fw->error;
}
/* Frees the word */
Index: src/swish-e.h
===================================================================
RCS file: /cvsroot/swishe/swish-e/src/swish-e.h,v
retrieving revision 1.15
diff -u -p -d -r1.15 swish-e.h
--- src/swish-e.h 14 Jul 2005 17:02:34 -0000 1.15
+++ src/swish-e.h 28 Jan 2007 22:24:54 -0000
@@ -57,6 +57,14 @@ typedef enum {
SWISH_HEADER_ERROR /* must check error in this case */
} SWISH_HEADER_TYPE;
+typedef enum {
+ STEM_OK,
+ STEM_NOT_ALPHA, /* not all alpha */
+ STEM_TOO_SMALL, /* word too small to be stemmed */
+ STEM_WORD_TOO_BIG, /* word is too large to stem, or would be too large */
+ STEM_TO_NOTHING /* word stemmed to the null string */
+} STEM_RETURNS;
+
typedef union
{
const char *string;
@@ -182,7 +190,7 @@ SW_FUZZYWORD SwishFuzzyWord( SW_RESULT r
SW_FUZZYWORD SwishFuzzify( SW_HANDLE sw, const char *index_name, char *word );
const char **SwishFuzzyWordList( SW_FUZZYWORD fw );
int SwishFuzzyWordCount( SW_FUZZYWORD fw );
-int SwishFuzzyWordError( SW_FUZZYWORD fw );
+STEM_RETURNS SwishFuzzyWordError( SW_FUZZYWORD fw );
void SwishFuzzyWordFree( SW_FUZZYWORD fw );
const char *SwishFuzzyMode( SW_RESULT r );
Received on Sun Jan 28 23:52:55 2007