At 10:46 AM 10/30/00 -0800, jmruiz@boe.es wrote:
>I forgot to mention one thing in my last post about 2.1.x.
>
>Finally, I have updated stemmer.c. Now it is thread safe
>and it also is independent of the string length or any other
>buffer length assumption.
>
>I have made a small modification of the function. Now the call
>changes how the parameter is passed. Now it looks like:
>
>int Stem( word, lenword ) /* redefined - Moseley 10/17/99 */
> char **word; /* in/out: the word stemmed */
> int *lenword; /* in/out: the length of word stemmed */
>
>So if more memory is needed, word buffer would be reallocated.
>
>Eg:
>
>char *myword;
>int mywordlen;
>
>myword=emalloc(6);
>strcpy(myword,"hello");
>Stem(&myword,&mywordlen);
>
>So, if the stemmed word needs 7 bytes, the Stem function will
>reallocate the buffer.
Hmm. I'm not sure I understand the issue. Maybe there are two issues.
Here's how it used to look:
int Stem_it( word, wordlen )
    char *word;   /* in/out: the word stemmed */
    int wordlen;  /* in: length of word, to avoid strcat overflow */
{
    int rule;     /* which rule is fired in replacing an end */
    /* Hack to make sure Stem() doesn't stem the word into nonexistence */
    char saveword[MAXWORDLEN];
    if ( wordlen != MAXWORDLEN ) return( TRUE );
    /* semi-graceful abort - SRE - 2/00 */
    strcpy( saveword, word );
First, I don't understand why wordlen needed to be passed in in the first
place. In search.c it just calls the stemmer like this:
Stem(word, MAXWORDLEN);
So I'm missing the point of passing MAXWORDLEN just to check that it still
is the same value after the call. Seems like that's saying:
if ( 2 != 2 ) { printf("we have a problem"); }
Now, I'm not clear on the change you are talking about. Is it to
protect against a situation where a stemmed word requires more memory than
the nonstemmed word?
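If that is the case, I would guess the in/out version has to do something roughly like the sketch below whenever a rule makes the word longer. This is only my reading of the description: append_suffix is a made-up helper and not anything in stemmer.c, it treats the length argument as the buffer size, and it uses plain malloc/realloc where the real code would presumably use the emalloc family.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from stemmer.c: append `suffix` to the
 * heap-allocated string *word, growing the buffer first when *buflen
 * bytes are not enough. Returns 1 on success, 0 if realloc fails. */
int append_suffix(char **word, int *buflen, const char *suffix)
{
    size_t need;

    assert(word != NULL && *word != NULL && buflen != NULL);

    /* bytes required for the joined string plus the terminating NUL */
    need = strlen(*word) + strlen(suffix) + 1;

    if (need > (size_t)*buflen) {
        char *bigger = realloc(*word, need);
        if (bigger == NULL)
            return 0;           /* the original buffer is still valid */
        *word = bigger;         /* caller's pointer now sees the new buffer */
        *buflen = (int)need;
    }
    strcat(*word, suffix);
    return 1;
}
```

With a 6-byte buffer holding "hello", appending a two-character suffix needs 8 bytes, so the buffer is reallocated and the caller's pointer and length are both updated. That would explain why the new call takes a char ** and an int * instead of the plain values.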
Seems like Stem() would be better as a function that returns the stemmed
word, where the passed parameter isn't modified:
stemmed_word = Stem( word );
But I'm no C programmer.
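To make the idea concrete, here is a rough sketch of what I mean. The name stem_copy and the single "drop a trailing s" rule are placeholders I made up for illustration; the real suffix rules would go where that one line is. The point is only the interface: the input is const and untouched, and the caller gets back a freshly allocated string that it has to free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: return a newly allocated stemmed copy of `word`
 * and leave `word` itself unmodified. Returns NULL if malloc fails.
 * The caller owns the result and must free() it. */
char *stem_copy(const char *word)
{
    size_t len;
    char *out;

    assert(word != NULL);

    len = strlen(word);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, word, len + 1);     /* copy including the NUL */

    /* Placeholder rule only: strip one trailing 's' from words longer
     * than three characters. A real stemmer applies its rules here. */
    if (len > 3 && out[len - 1] == 's')
        out[len - 1] = '\0';

    return out;
}
```

The call then looks like stemmed_word = stem_copy( word ); followed by free( stemmed_word ) when done. The cost is one allocation per word, but the caller never has to guess how big the buffer needs to be.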
Bill Moseley
mailto:moseley@hank.org
Received on Mon Oct 30 19:25:43 2000