
Re: fhash.c bug?

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Tue Mar 29 2005 - 15:03:45 GMT
I think we're seeing two different errors, both with incremental mode.

Testing on Red Hat 7.2 (gcc 3.4.2, kernel 2.4.20-28.7) with your test script, I 
also get consistent core dumps at around 230-240 loops. The system error I get 
is a 'segmentation fault', which doesn't exactly match what I was originally 
seeing (a hung loop in uncompress1).

Testing on OS X with the same script, I also got fatal errors, at around 12 
loops (interesting: much faster fatalities). The system error there is 'Bus 
error' -- again, not the same error I was originally seeing.

On both Linux and OS X, the error happens during 'Writing word text', after all 
the files have been indexed.

When I removed the -u (update) option from the loop test and just rebuilt the 
index from scratch each time, it did *not* core dump.
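
For reference, the no-update control loop I ran looked roughly like this (same 
paths as in your script, quoted below, just without -u):

#!/bin/sh
nr=0
while true ; do
	nr=`expr $nr + 1`
	echo "LOOP: $nr"
	# full rebuild each pass instead of an incremental update
	touch karman/* && ./swish-e -S fs -i karman -f 1/bug 2>/dev/null || exit
done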

So I'm not sure that we're having the same problem. That's too bad, since it 
would mean there are at least two bugs at play (or one bug showing diverse 
symptoms).

It seems like your script exposes a flaw in the -u option, and that flaw looks 
fairly reproducible.

My problem seems to be something else, as yet unnamed. I could not duplicate the 
problem on Linux with the 12 files I sent you -- so I copied those 12 back to 
OS X and tried again, and could not duplicate it there either! But when I try 
the original 12, in the original directory, on OS X, I get the error (the hang 
in uncompress1) every time. (Just for sanity, I rebuilt the latest swish-e on 
OS X against the latest zlib, 1.2.2, and got the same error. OS X ships with 
zlib 1.1.3, so I wondered if there was a zlib problem somewhere.)
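
(In case it helps anyone rule zlib in or out: assuming a dynamically linked 
binary, you can check which libz swish-e actually resolves to at run time with 
something like

$ ldd ./swish-e | grep libz       # Linux
$ otool -L ./swish-e | grep libz  # OS X

where the ./swish-e path is illustrative.)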

Dobrica Pavlinusic scribbled on 3/28/05 9:44 AM:
> On Mon, Mar 28, 2005 at 07:02:24AM -0800, Peter Karman wrote:
> 
>>>What is your smallest fileset on which you can demonstrate problem?
>>
>>I can name that tune in 12 files. I have put a tar at 
>>http://peknet.com/swish/12badfiles.tar.gz
>>
>>Can you duplicate the error with that set? For me it consistently hangs on 
>>'acl_size.3c.html'.
> 
> 
> Strange. It works just fine for me. I used the latest CVS version, compiled
> with:
> 
> $ ./configure --disable-docs --with-pcre --enable-incremental --disable-shared
> 
> I indexed it using:
> 
> $ swish-e -S fs -f 1/test -i karman/
> 
> And it worked for me. Then I tried
> 
> $ swish-e -S fs -f 1/test -i karman/ -u
> 
> which also worked. However, then I remembered that I had problems with
> repeated indexing of the same data, so I wrote the following script (inlined
> so that the list doesn't remove it):
> 
> #!/bin/sh
> 
> # build the initial index once
> ./swish-e -S fs -i karman -f 1/bug 2>/dev/null
> nr=0
> while true ; do
> 	nr=`expr $nr + 1`
> 	echo "LOOP: $nr"
> 	# touch the files so -u treats them as modified, then update in place;
> 	# bail out as soon as swish-e exits non-zero (e.g. dumps core)
> 	touch karman/* && ./swish-e -S fs -i karman -f 1/bug -u 2>/dev/null || exit
> done
> 
> After about 500 loops it always dumps core on me. It doesn't dump
> core on the same loop every time, however.
> 
> I also noticed that the "unique words indexed" count gets incremented from
> time to time. Recompiling with --enable-psortarray didn't help.
> 
> Well, now we have a semi-reproducible bug. I need to finish various
> other things for tomorrow, so further investigation is on hold for now.
> 
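
If it helps pin down where the loop dies, here is a rough recipe for pulling a 
backtrace out of one of those cores (the binary and core-file names are 
illustrative):

$ ulimit -c unlimited   # allow core files before starting the loop
$ gdb ./swish-e core    # load the core the crash leaves behind
(gdb) bt                # print the stack at the crash point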

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Tue Mar 29 07:04:00 2005