Re: [swish-e] Segmentation fault for query from Perl-API for 2.4.7

From: Kruno Sever
Date: Tue, 05 Nov 2013 14:46:08 +0100
On 04.11.2013 16:55, Kruno Sever wrote:
> Hi!
> I am still trying to find out why certain 64-bit SWISH-E queries
> on my standard Debian 7 system lead to segmentation faults.
> After sprinkling search.c with debugging output, I seem to be closing
> in on the functions uncompress_location_values() and
> uncompress_location_positions() in getfileinfo().
> I will continue to look into this tomorrow, but maybe someone here has a
> good idea in the meantime on how I can proceed.
> My problem index has been reduced to an archive of about 500KB of xml
> files, which I could make available on request, in case you are willing
> to help me out.

Okay, some new insights here, mostly for those familiar with the 
sources: first I discovered that accessing my index with "-T 
INDEX_WORDS" also results in a segmentation fault. Assuming this is the 
same problem, I then managed to reduce my problem index to about 185KB 
(compressed tgz), which I can easily mail to anyone who wants to have a 
look at this.

Using "-T INDEX_WORDS_FULL" gives an error message involving the 
compression routines (no segfault!):

err: _c is < 0 in uncompress1()

which seems to indicate an EOF while reading from the index.

After recompiling with -DDEBUG_PROP I got more extensive debugging info. 
For the last file before the error, I notice this strange output:

    readfileoffset -1662937669732139008
    PropIDX: 0  data[Seek: 1043688] at seek 924088 read 8 bytes (one readlong)
    readfileoffset 7521962920874570504
    PropIDX: 1  data[Seek: 608939674771219304] at seek 924096 read 8 bytes (one readlong)

where the [Seek: ...] value suddenly becomes abnormally large. In the 
preceding lines it was of comparable magnitude to the "small" seek 
value. The error is printed only after all PropIDX values have been 
shown for this file, all of them with questionable Seek: values.

I added the readfileoffset lines myself; they show the values 
(sw_off_t) actually read from the property file in readfileoffset() 
before UNPACKFILEOFFSET() is applied to convert them into the Seek: 
value printed.
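
As a cross-check of the two value pairs above, here is a minimal Python 
sketch (not SWISH-E code). It assumes that the offsets are stored 
big-endian in the property file, that the machine is little-endian, and 
that UNPACKFILEOFFSET() effectively amounts to a byte swap there. The 
helper names are made up for illustration.

```python
import struct


def on_disk_bytes(raw_host_value: int) -> bytes:
    """Recover the 8 bytes as they sat in the file.

    Assumption: the raw value was read into a signed 64-bit integer
    on a little-endian host.
    """
    return struct.pack("<q", raw_host_value)


def unpack_offset(raw_host_value: int) -> int:
    """Reinterpret those 8 bytes as a big-endian signed 64-bit integer.

    Assumption: this mirrors what UNPACKFILEOFFSET() does on a
    little-endian machine; it is not the actual macro.
    """
    return struct.unpack(">q", on_disk_bytes(raw_host_value))[0]


# The two raw readfileoffset values from the debug output.
samples = [-1662937669732139008, 7521962920874570504]

for raw in samples:
    print(raw, "->", unpack_offset(raw), on_disk_bytes(raw))
```

Under these assumptions both Seek: values are reproduced exactly (1043688 
and 608939674771219304), and the eight bytes behind the second one are 
b'\x08schlach', that is, mostly printable text. That would be consistent 
with the read landing in string data rather than in an offset table, but 
it does not explain how it got there.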

I also noticed that I have to regenerate the index to be able to 
reproduce this behaviour; it appears the segfault is somehow capable of 
corrupting the index. For a time I suspected a corrupt hard disk, but 
these segfaults are reproducible across different machines.
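
One way to confirm whether a crashing query really modifies the index is 
to compare checksums of the index files before and after the run. A 
minimal Python sketch follows; the file names index.swish-e and 
index.swish-e.prop are assumptions about a default setup, and the helper 
names are made up for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Map each existing path to its digest; missing files are skipped."""
    return {str(p): file_digest(p) for p in paths if Path(p).exists()}


def changed_files(before, after):
    """List files whose digest differs, or which vanished, since 'before'."""
    return sorted(p for p in before if after.get(p) != before[p])


# Usage (assumed file names):
#   files = ["index.swish-e", "index.swish-e.prop"]
#   before = snapshot(files)
#   ... run the query that segfaults ...
#   print(changed_files(before, snapshot(files)))
```

An empty list after the crash would suggest the index files themselves 
are untouched and that the difference between runs lies elsewhere.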

Can anyone see a possible cause or a way to fix this? While it is 
kinda interesting digging into the SWISH-E code, I would really 
appreciate some help.

Received on Tue Nov 05 2013 - 13:46:03 GMT