Re: more out of memory fun

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Oct 20 2006 - 03:35:34 GMT
On Thu, Oct 19, 2006 at 10:19:28PM -0400, Brad Miele wrote:
> one question about debugging with gdb, what do i do :)? Sorry, i know how 
> to do gdb swish-e, and then run <switches and whatnot> but what do i do 
> after the crash to get more info?

I'm way rusty.

Depends on how hard it crashes.  Basically, you get a backtrace (bt)
where it segfaults.  Then you look back through it and try to see what
was happening where and if it makes sense.  Normally it doesn't.  If
it crashes hard then you may not even get a backtrace that makes any
sense.  From there you set breakpoints and watch variables to try to
track down the problem.  At one point I knew most of the indexing
code, but I would need to completely relearn it to be able to make
quick work of tracking down a segfault.  The bummer is that in your
case it takes so long to happen.
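
Since the crash takes a long time to show up, one option is to let gdb
run unattended and print the backtrace by itself when the program
stops.  The following is a minimal sketch, not a tested recipe for
swish-e: it only builds the command line, using gdb's -batch, -ex and
--args options.  The helper name gdb_backtrace_command and the config
file name swish.conf are made up for illustration.

```python
import shlex


def gdb_backtrace_command(program, program_args):
    """Build a gdb command line that runs the program and then prints
    a backtrace once the program has stopped (e.g. on a segfault)."""
    return [
        "gdb", "-batch",      # exit gdb after the -ex commands finish
        "-ex", "run",         # start the program under gdb
        "-ex", "bt",          # print the backtrace after it stops
        "--args", program, *program_args,
    ]


# Hypothetical invocation: swish.conf is an assumed file name.
cmd = gdb_backtrace_command("swish-e", ["-c", "swish.conf"])
print(shlex.join(cmd))
```

Interactively the same thing is "run <switches>" at the (gdb) prompt,
then "bt" after the crash.  "break" and "watch" are the commands for
the breakpoints and watched variables mentioned above.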

> finally, why do i need to use -e when i have so many resources? when 
> swish-e gave that out of memory error, i still had over 2G totally free 
> via top.

A 32-bit limit in swish?  I doubt it detects integer overflow
correctly.
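
To illustrate that guess (this is not swish-e's code, and the numbers
are hypothetical): if a size is computed in a signed 32-bit integer
and goes past 2^31 - 1, it can wrap to a negative value, and an
allocation request built from it can fail even though plenty of RAM is
free.  In C, signed overflow is undefined behaviour, but wrapping is
what usually happens on common hardware.  The sketch below simulates
the wrap in Python and shows what a checked multiply would do instead.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31


def wrap_int32(n):
    """Return n reduced to a signed 32-bit value (two's complement wrap)."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n


def checked_mul_int32(a, b):
    """Multiply a and b, raising OverflowError instead of wrapping."""
    result = a * b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} * {b} does not fit in 32 bits")
    return result


# Hypothetical figures: 400 million table entries of 8 bytes each.
entries, entry_size = 400_000_000, 8
wrapped = wrap_int32(entries * entry_size)
print("true size:", entries * entry_size, "wrapped size:", wrapped)
```

The wrapped size comes out negative, which is the kind of value that
could turn into a bogus "out of memory" error.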

Swish uses hash tables and the larger they get the slower access to
the table is.  Remember, swish was designed for indexing thousands or
tens of thousands of files.  It's very fast at that.  The trade-off is
it's not that scalable.
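
One common reason a hash table slows down as it grows is a fixed
number of buckets: every lookup has to walk a chain whose average
length rises in step with the number of entries.  Whether swish-e's
tables work exactly this way is an assumption here.  The class below
is a hypothetical sketch to show the effect, not a copy of the real
code.

```python
class FixedBucketTable:
    """Chained hash table whose bucket count never grows."""

    def __init__(self, bucket_count):
        self.buckets = [[] for _ in range(bucket_count)]
        self.size = 0

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        chain = self._chain(key)
        for i, (existing, _) in enumerate(chain):
            if existing == key:          # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.size += 1

    def get(self, key):
        for existing, value in self._chain(key):   # linear walk of the chain
            if existing == key:
                return value
        raise KeyError(key)

    def average_chain_length(self):
        return self.size / len(self.buckets)


table = FixedBucketTable(1024)
for target in (10_000, 100_000):
    for i in range(table.size, target):
        table.put(f"word{i}", i)
    print(target, "entries, average chain length:",
          table.average_chain_length())
```

With the bucket count fixed, ten times the entries means chains ten
times as long, so each lookup does roughly ten times the work.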

I generated a million random docs once.  With -e it was much slower at
first but kept running at a reasonably steady pace.  Without -e it was
way faster for the first 100K files or so, then started slowing down
as the hash tables filled.  Overall, -e ended up being faster.



-- 
Bill Moseley
moseley@hank.org

Unsubscribe from or help with the swish-e list: 
   http://swish-e.org/Discussion/

Help with Swish-e:
   http://swish-e.org/current/docs
   swish-e@sunsite.berkeley.edu
Received on Thu Oct 19 20:35:37 2006