Re: swish-e-2.1-dev-25 core-dumps; sparc64-linux,

From: Eric T. Jorgensen <ericj(at)not-real.eskimo.com>
Date: Sat Feb 23 2002 - 04:28:10 GMT

On Fri, 22 Feb 2002, David L Norris wrote:

> On Fri, 2002-02-22 at 20:31, Eric T. Jorgensen wrote:
> > --- SIGBUS (Bus error) ---
> > +++ killed by SIGBUS +++
> > -----
> 
> One of several things:
> 
>   1. You have bad RAM or a flaky memory/CPU bus.


Can't rule it out too quickly, of course, but the same server runs Apache
(1.3.17) and this was happening both during and after 'busy' times with no
errors coming up in its logs or other dumps... and older swish-e 2.0.5
compiles/tests/runs correctly, including on 'real' web indexes (apart from
the loop glitch I mentioned yesterday).

As the other error still happens in that version, I took out the particular
parent URL that was triggering it for some reason, but would like to find
a way to avoid excluding web dirs that shouldn't need to be excluded.  :)


>   2. You are running out of RAM (quite possible).  Linux generates a
> SIGBUS (and kills) on an out of RAM condition (rather than allowing
> malloc() to return a failure code possibly blowing up the program
> without explanation).  Try running your tests on a small subset of
> documents.  (Or were you already?)


Shouldn't be; I saw over 37MB free RAM, etc., before and after running
the test.  This was on the 'make test' after a fresh compile, with the
files:

% ls -l tests
total 32
-rw-------   1 ericj    staff         443 Sep  1 00:25 test.config
-rwx------   1 ericj    staff         437 Dec 12  2000 test.html
-rw-------   1 ericj    staff          55 Dec  3  2000 test.txt
-rwx------   1 ericj    staff         126 Jul 26  2001 test.xml
-rwx------   1 ericj    staff         233 Dec 22  2000 test_meta.html
-rwx------   1 ericj    staff         264 Dec 22  2000 test_meta2.html
-rwx------   1 ericj    staff         159 Jan 11  2001 test_phrase.html
-rwx------   1 ericj    staff         159 Dec  3  2000 test_xml.html

Also, since I didn't compile in libxml2, I tried removing the 'xml'
refs in the test.config (to not index the 'test.xml' and 'test_xml.html'
files), but got the same core dumps.

Immediately afterward, approx. same amount of free RAM, 2.0.5 ran fine on a
'live' index of a subset of URLs.


>   3. SWISH-E could be reading a memory address that isn't aligned on
> its boundary.  SPARC-Linux has some unique issues because it requires
> aligned memory access.  I've not had to deal with this.  But, I suspect
> maybe someone on the list has.


Sounds like it'd probably be this.  Thanks for the quick response and
ideas, though.  :)
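
For anyone following along, here is a minimal sketch of the alignment
issue from point 3.  It is not taken from the SWISH-E source, and the
helper names are made up for illustration.  Casting a byte pointer to a
wider type and dereferencing it at an odd offset is undefined behaviour
in C; x86 usually tolerates it, but SPARC can raise SIGBUS.  Copying
through memcpy() avoids the problem, assuming that is what is going on
here:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not part of SWISH-E. */

/* Risky pattern (shown only as a comment, never executed here):
 *     uint32_t v = *(const uint32_t *)(buf + 1);
 * buf + 1 is generally not 4-byte aligned, so this load can be
 * killed with SIGBUS on strict-alignment CPUs such as SPARC. */

/* Read a 32-bit value from any byte offset, in host byte order. */
static uint32_t read_u32_unaligned(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);    /* byte-wise copy, no alignment needed */
    return v;
}

/* Store a 32-bit value at any byte offset, in host byte order. */
static void write_u32_unaligned(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);    /* byte-wise copy, no alignment needed */
}
```

A value written with write_u32_unaligned(buf + 1, x) reads back
unchanged with read_u32_unaligned(buf + 1), whatever the alignment of
buf.  Compilers normally turn the memcpy() into a plain load or store
where the target allows it, so the safe version should cost little.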

~ Eric
  ericj@eskimo.com
Received on Sat Feb 23 04:28:45 2002