Re: swish-e core dump while indexing XML documents

From: Bill Schell <friedfish(at)not-real.optonline.net>
Date: Wed Apr 07 2004 - 20:51:53 GMT
I finally got back to this problem after being pulled off on something else
for a couple weeks.

There is a bug on line 253 of src/file.c (in swish-e-2.4.2).
That line reads:
buffer[bytes_read+1] = '\0';  /* hopfully doesn't read more than filelen bytes;) */

it should read:
buffer[bytes_read] = '\0';  /* hopefully doesn't read more than filelen bytes;) */

bytes_read+1 is one byte past the end of the allocated buffer. With the fix,
I can now index the 100K or so documents I was indexing with no problem.
The open source tool 'valgrind' was very handy in locating this problem.
I recommend it.
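
For readers who want to see the indexing issue in isolation, here is a minimal sketch. It is not the swish-e code: read_terminated() is a hypothetical helper, and it assumes the buffer is allocated as filelen + 1 bytes, which is what the description above implies.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not swish-e's read_stream(): reads up to filelen
 * bytes from fp into a freshly allocated, NUL-terminated buffer.
 * The caller frees the result. */
static char *read_terminated(FILE *fp, size_t filelen, size_t *bytes_read_out)
{
    /* filelen data bytes plus one byte for the terminator,
     * so the valid indices are 0 .. filelen. */
    char *buffer = malloc(filelen + 1);
    if (buffer == NULL)
        return NULL;

    size_t bytes_read = fread(buffer, 1, filelen, fp);

    /* bytes_read <= filelen, so this index is always inside the buffer.
     * Writing to buffer[bytes_read + 1] instead would hit index
     * filelen + 1 whenever the whole file was read: one byte past the end. */
    assert(bytes_read <= filelen);
    buffer[bytes_read] = '\0';

    if (bytes_read_out != NULL)
        *bytes_read_out = bytes_read;
    return buffer;
}
```

With the +1 variant, the stray byte lands in whatever follows the allocation, which may be allocator bookkeeping. That would be consistent with malloc failing much later and in unrelated places, and it is the kind of invalid write that valgrind reports at the point where it happens.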

Bill Schell


On Monday 22 March 2004 21:35, Bill Schell wrote:
> OK, I trimmed down the list of source files, but the fault seems unrelated
> to a particular source file.  In the example, I was indexing two directories
> (030106 and 030107).  So, I moved 2/3 of the files from 030106 to 030106.2
> (since it died processing 030106).  Then I ran  swish-e -i 030106 030107.
> Works fine, no problem.   Then I ran swish-e -i 030106.2 030107.  It died,
> but while processing 030107!   The stack trace is similar to the previous
> one, with malloc getting confused and dying, but it is being called from a
> different place.
>
> (gdb) where
> #0  0x4024f47e in malloc_consolidate () from /lib/libc.so.6
> #1  0x4024ed83 in _int_malloc () from /lib/libc.so.6
> #2  0x4024df1a in malloc () from /lib/libc.so.6
> #3  0x4018711e in emalloc (size=1076930976) at mem.c:80
> #4  0x40187209 in allocChunk (size=262144) at mem.c:566
> #5  0x401872d6 in Mem_ZoneAlloc (head=0x80c7e88, size=9676) at mem.c:616
> #6  0x0805afed in read_stream (sw=0x80c3f78, fprop=0x814e298, is_text=1) at file.c:249
> #7  0x08055b0c in do_index_file (sw=0x80c3f78, fprop=0x814e298) at index.c:850
> #8  0x080505ce in printfile (sw=0x80c3f78,
>     filename=0x814e298 "\230�r\bX�\b\b�024\b�r\b_�\b�") at fs.c:601
> #9  0x08050683 in printfiles (sw=0x80c3f78, e=0x80df278) at fs.c:642
> #10 0x08050276 in indexadir (sw=0x80c3f78, dir=0x80d7d60 "030107") at fs.c:445
> #11 0x0805af86 in indexpath (sw=0x80c3f78, path=0x80d7d60 "030107") at file.c:217
> #12 0x0804d098 in cmd_index (sw=0x80c3f78, params=0x80d1f70) at swish.c:1351
> #13 0x0804bbb8 in main (argc=9, argv=0xbffff594) at swish.c:209
> #14 0x401ecd06 in __libc_start_main () from /lib/libc.so.6
>
> This definitely looks like malloc arena corruption coming from someplace.
> I put a printf in the readdir() loop in indexadir to print out each file
> name that is going to pass through to printfile(s).   I noticed that one of
> the entries printed as "(null)".  Sounds like a problem...more debugging
> tomorrow.
>
> Bill Schell
>
> ---------------------------------------------------------------------------
>
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 16384 (LWP 6498)]
> > 0x4024ef2d in _int_malloc () from /lib/libc.so.6
> > (gdb) where
> > #0  0x4024ef2d in _int_malloc () from /lib/libc.so.6
> > #1  0x4024e78c in calloc () from /lib/libc.so.6
> > #2  0x401ade76 in zcalloc () from /usr/lib/libz.so.1
> > #3  0x401aa5b7 in deflateInit2_ () from /usr/lib/libz.so.1
> > #4  0x401aa40a in deflateInit_ () from /usr/lib/libz.so.1
> > #5  0x401a8b0d in compress2 () from /usr/lib/libz.so.1
> > #6  0x080614bf in compress_property (prop=0x82d2a98, sw=0x80c3f78,
> > buf_len=0xbffff364,
> >     uncompressed_len=0xbffff368) at docprop_write.c:163
>
> When I see a segfault in malloc I wonder if there isn't memory
> corruption happening someplace else.
>
> [...]
>
> > #7  0x0806139d in WritePropertiesToDisk (sw=0x80c3f78, fi=0xbffff3b0)
> >     at docprop_write.c:100
> > #8  0x08055c2c in do_index_file (sw=0x80c3f78, fprop=0x82228a8) at index.c:994
>
> > #9  0x080505ce in printfile (sw=0x80c3f78, filename=0x82228a8
> > "???r\bh(\"\b???\r\b")
> >     at fs.c:601
>
> That's an odd file name.  Maybe that's a result of the memory
> corruption.
>
> Thanks for this great debugging session:
> > #6  0x080614bf in compress_property (prop=0x82d2a98, sw=0x80c3f78,
> > buf_len=0xbffff364,
> >     uncompressed_len=0xbffff368) at docprop_write.c:163
> > 163         zlib_status = compress2( (Bytef *)PropBuf, &dest_size,
> > prop->propValue, prop->propLen, sw->PropCompressionLevel);
> > (gdb) p *prop
> > $1 = {propLen = 200, propValue = "Q"}
>
> So these numbers look ok.
>
> > (gdb) p *buf_len
> > $3 = 15
> > (gdb) p *uncompressed_len
> > $4 = 0
> > (gdb)
>
> Is it possible to trim down your source files to (hopefully) just a few
> files that cause the segfault and make them available?  Looks like
> something is overrunning some memory.
>
> You might set a break point for printfile and watch the file names.
> Might be able to work back from there and see when that gets corrupted.
> Assuming you don't have a file named ???r\bh(\"\b???\r\b.
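
The quoted message of 22 March mentions putting a printf in the readdir() loop to print each file name before it is processed. A rough sketch of that kind of logging follows. log_dir_entries() is a hypothetical stand-alone helper, not swish-e's indexadir(), and it only uses the POSIX opendir/readdir/closedir calls.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical debugging helper, not swish-e's indexadir(): logs every
 * entry readdir() returns for path and reports how many were seen.
 * Returns -1 if the directory cannot be opened. */
static int log_dir_entries(const char *path, FILE *log)
{
    assert(path != NULL && log != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip the "." and ".." entries. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        fprintf(log, "entry %d: \"%s\"\n", count, entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling log_dir_entries("030107", stderr) before indexing would give a list of names to compare against what the indexer later receives. A name that is fine here but garbled further down the call chain points at corruption after the directory was read.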
Received on Wed Apr 7 13:51:54 2004