Large Merge Consuming too many resources

From: Tac <tac(at)>
Date: Mon Aug 30 2004 - 11:17:04 GMT
We've been running the indexing process on our production server (because
it's fast, same box as the database, etc.), but at times it consumes so many
resources it can bring our site temporarily down.  I'm trying to figure out
a way to let other processes have some time slices.  Two questions:
(1) Does swish-e index a record coming in via -S prog as it gets the record,
or does it wait until all records have been retrieved?  If it indexes each
record as it arrives, I can add a sleep() call every few thousand records.
(2) The major problem is not during the individual collection indexing, but
during the 150+ index merge.  (Merging them all into 1 makes searches
against all the collections MUCH faster).  I don't think the bottleneck is
CPU; I think it's disk access, or maybe something else (file handles?).  But
it's so intensive that even the MySQL server on that same quad-processor
machine locks up threads and can't process any queries.  Any ideas for how
to get around this?  We're looking at moving the indexing to another box,
but that requires getting the data from across the network, indexing and
merging, then ftp'ing the indices back, which seems like a lot of work.  If
there's a way to not have swish-e consume all the resources it'd be much
easier to keep it where it is.
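
To make the sleep() idea in (1) concrete, here is a minimal sketch of a throttled feeder for -S prog. It assumes the usual prog record layout of a Path-Name header, a Content-Length header giving the body size in bytes, a blank line, and then the body. The names emit_records, format_record, batch_size and pause_seconds are made up for illustration and are not part of swish-e. The pause only helps if swish-e indexes records as they arrive, which is exactly what (1) asks.

```python
import time


def format_record(path, content):
    """Build one document: headers, a blank line, then the body."""
    body = content.encode("utf-8")
    # Content-Length counts bytes, so measure the encoded body.
    header = f"Path-Name: {path}\nContent-Length: {len(body)}\n\n"
    return header.encode("utf-8") + body


def emit_records(records, out, batch_size=2000, pause_seconds=1.0,
                 sleep=time.sleep):
    """Write (path, content) pairs to out, pausing after each full batch.

    batch_size and pause_seconds are assumed tuning knobs; the right
    values depend on the machine and the load. Returns the pause count.
    """
    pauses = 0
    for count, (path, content) in enumerate(records, start=1):
        out.write(format_record(path, content))
        if count % batch_size == 0:
            out.flush()           # hand the finished batch to the reader
            sleep(pause_seconds)  # give other processes some time
            pauses += 1
    out.flush()
    return pauses
```

A prog script would pass its database rows as records and sys.stdout.buffer as out. The sleep parameter is there only so the pause can be swapped out when testing.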

Received on Mon Aug 30 04:17:34 2004