swish-e-2.0.5; intermittent loop in search and merge

From: Eric T. Jorgensen <ericj(at)not-real.eskimo.com>
Date: Fri Feb 22 2002 - 05:54:08 GMT
Just compiled swish-e-2.0.5 on sparc64-Linux (2.2.17-RAID); './configure',
'make', and 'make test' all ran without errors.  I'm using the HTTP method to
index small chunks at a time and then merge them into a larger database, but
I'm running into an endless loop, though only *sometimes*...

Sample files...

-rw-r--r--   1 ericj    staff      158642 Feb 21 17:09 swish.2
-rw-r--r--   1 ericj    staff     1048472 Feb 21 17:32 swish.4
-rw-r--r--   1 ericj    staff      873571 Feb 21 18:23 swish.x
-rw-r--r--   1 ericj    staff       67552 Feb 21 18:07 swish.y


Sample commands...

% swish-e -S http -c conf.x
..(writes to 'swish.x')...

% swish-e -w testing -f swish.x
(works well, no loop)

% swish-e -w test -f swish.x
(loops with data below)


Strangely *both* searches ('test' and 'testing') work correctly in the
other files (swish.2, swish.4, swish.y) with either 'results' or 'no
results' outputs.  On 'swish.x', however...

	# Swish-e format 2.0
	# 
	# Name: Index of http://www.eskimo.com/
	# Saved as: swish.x
	# Counts: 14763 words, 178 files
	...
	# IgnoreFirstChar: "'(
	# IgnoreLastChar: "'),.;
	# SWISH format 2.0
	# Search words: test

At which point it hangs.  Running this through strace brings up (one
username changed to '[edit]'):

-----
..
_llseek(0x3, 0, 0, 0xefffee30, 0)       = 0
_llseek(0x3, 0, 0xd0000, 0xefffee30, 0) = 0
read(3, "b&\t\1\227\26\0\6yields\224\315\f\0\1\35\0\r\0}M\t\1\211"..., 8108) = 8108
read(3, ",http://www.eskimo.com/support/a"..., 8192) = 8192
_llseek(0x3, 0, 0, 0xefffee30, 0)       = 0
_llseek(0x3, 0, 0xd4000, 0xefffee30, 0) = 0
read(3, ".com/~[edit]/chs1986/addguest.ht"..., 389) = 389
read(3, "%http://www.eskimo.com/~[edit]/n"..., 8192) = 4830
brk(0x8c000)                            = 0x8c000
read(3, "", 40960)                      = 0
read(3, "", 8192)                       = 0
read(3, "", 8192)                       = 0
read(3, "", 8192)                       = 0
read(3, "", 8192)                       = 0
-----
(last line ad infinitum)
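
A quick arithmetic check on the trace above suggests the short read (4830 of
8192 bytes requested) stopped exactly at the end of the file. This is only a
sketch, and it assumes the 'swish.x' in the directory listing is the same file
being searched and that the third _llseek argument is the low word of an
absolute offset (whence 0):

```python
# Numbers copied from the strace output and the 'ls -l' listing above.
SEEK_OFFSET = 0xd4000      # low word of the last _llseek, whence 0 (SEEK_SET)
READ_SIZES = [389, 4830]   # bytes returned by the two reads after that seek
FILE_SIZE = 873571         # size of swish.x in the listing

# File position after the last read that returned data.
end_position = SEEK_OFFSET + sum(READ_SIZES)

print(end_position, end_position == FILE_SIZE)   # 873571 True
```

If those assumptions hold, every later read returns 0 simply because the file
position is already at end-of-file.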


Searching for 'testing' on the same database ends with (starting at the
same point):

	# Swish-e format 2.0
	# 
	# Name: Index of http://www.eskimo.com/
	# Saved as: swish.x
	# Counts: 14763 words, 178 files
	...
	# IgnoreFirstChar: "'(
	# IgnoreLastChar: "'),.;
	# SWISH format 2.0
	# Search words: testing
	# Number of hits: 2
	[edited...]
	[edited...]
	.

and...

-----
..
_llseek(0x3, 0, 0, 0xefffee30, 0)       = 0
_llseek(0x3, 0, 0xd0000, 0xefffee30, 0) = 0
read(3, "b&\t\1\227\26\0\6yields\224\315\f\0\1\35\0\r\0}M\t\1\211"..., 8001) = 8001
read(3, "\34http://www.eskimo.com/swish/\23SW"..., 8192) = 8192
_llseek(0x3, 0, 0, 0xefffee30, 0)       = 0
_llseek(0x3, 0, 0xd4000, 0xefffee30, 0) = 0
read(3, ".com/~[edit]/chs1986/addguest.ht"..., 1240) = 1240
read(3, "\'http://www.eskimo.com/~[edit]/res"..., 8192) = 3979
close(3)                                = 0
munmap(0x70024000, 8192)                = 0
write(1, "# Swish-e format 2.0\n# \n# Name: "..., 1116) = 1116
munmap(0x70022000, 8192)                = 0
exit(0)                                 = ?
-----
(back to shell prompt)
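
For comparison, the same arithmetic can be applied to both runs. Under the
assumption that the _llseek calls use absolute offsets (whence 0), the 'test'
and 'testing' searches end up at the same file position, so the difference
would be in what the program does once it gets there, not in how far it read:

```python
# Numbers copied from the two strace excerpts in this message.
SEEK_OFFSET = 0xd4000           # last _llseek target in both traces

HANGING_READS = [389, 4830]     # search for 'test' (loops)
WORKING_READS = [1240, 3979]    # search for 'testing' (exits normally)

hanging_end = SEEK_OFFSET + sum(HANGING_READS)
working_end = SEEK_OFFSET + sum(WORKING_READS)

print(hanging_end, working_end)   # 873571 873571
```

Both positions match the 873571-byte size shown for swish.x in the listing.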


Since 'brk' is a memory-related system call (it changes the data segment
size), I made sure my ulimits were unlimited:

% ulimit -a
time(cpu-seconds)    unlimited
file(blocks)         unlimited
coredump(blocks)     1000000
data(kbytes)         unlimited
stack(kbytes)        unlimited
lockedmem(kbytes)    unlimited
memory(kbytes)       unlimited
nofiles(descriptors) 1024
processes            2028



The same "brk...read...read...read...read..." loop happens during a -M
merge using that swish file.
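
One pattern that would produce a trace like this is a read loop that keeps
retrying when read() returns 0 instead of treating that as end-of-file. I have
not checked the swish-e source, so this is only a guess at the class of bug,
not a diagnosis. The sketch below shows the defensive version of such a loop;
'read_exact' and 'demo' are hypothetical names, not swish-e functions:

```python
import os
import tempfile


def read_exact(fd, count):
    """Read exactly `count` bytes from fd, failing loudly on early EOF."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            # os.read returns b"" at end-of-file, like read(2) returning 0.
            # Without this check the loop would spin forever.
            raise EOFError("hit end of file with %d bytes still wanted" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo():
    """Write 10 bytes to a scratch file, read them back, then read past EOF."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)
        data = read_exact(fd, 10)
        try:
            read_exact(fd, 1)
            hit_eof = False
        except EOFError:
            hit_eof = True
        return data, hit_eof
    finally:
        os.close(fd)
        os.remove(path)
```

Calling demo() reads the ten bytes back and then reports that the next read
hit end-of-file rather than looping.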

It's a stumper at the moment.  Any ideas out there?

It would be a "production" environment eventually, so I'd prefer to stick
with the "stable" one (although today's 2.1 devel version compiled, it
failed 'make test's filesystem test).

Am trying 'swish-e' as the older 'swish' dumps core during some filesystem
indexing as well (although I haven't tried that with the unlimited 'stack'
size).


Thanks for any ideas...

~ Eric
  ericj@eskimo.com
Received on Fri Feb 22 05:54:30 2002