Follow-up:
I've had a merge of about 50 files running for 400+ CPU minutes at
98-99% CPU utilization. It has completed only 20 of the 50 files and
now has a memory footprint of 325 MB. At the rate it's growing, it
will run out of memory (500 MB) before it completes.
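
A rough back-of-the-envelope projection of those numbers, assuming
memory and CPU time grow linearly with each merged file (an assumption
for illustration, not something measured; the function and its
parameter names are hypothetical, and 400 CPU minutes is a lower
bound):

```python
def project_merge(files_done, files_total, mem_mb, cpu_min, mem_limit_mb):
    """Linearly extrapolate merge cost from progress so far.

    Assumes each file adds the same amount of memory and CPU time,
    which is a simplification and may not match how the merge behaves.
    """
    mem_per_file = mem_mb / files_done
    cpu_per_file = cpu_min / files_done
    return {
        # Expected footprint once all files are merged
        "projected_mem_mb": mem_per_file * files_total,
        # Expected total CPU minutes for the whole merge
        "projected_cpu_min": cpu_per_file * files_total,
        # How many files fit before the memory limit is exceeded
        "files_before_limit": int(mem_limit_mb // mem_per_file),
    }


# Figures from the message above: 20 of 50 files, 325 MB, 400 CPU min, 500 MB RAM
estimate = project_merge(20, 50, 325, 400, 500)
print(estimate)
```

Under that linear assumption the merge would need roughly 812 MB and
about 1000 CPU minutes in total, and would exceed 500 MB after about
30 of the 50 files.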
> > On Mon, 15 Jul 2002, Michael wrote:
> > > merges are slower than 2.05 -- hard to say quantitatively now. My
> > > guess is around 2:1 for files as above at least. I used to be able to
> > > merge the whole thing in a day or two but based on the benchmark of
> > > above, it would take 80-100 hrs or more.
> >
> > I really can not remember. IIRC, merge in 2.1-dev works like it did
> > in 2.05, but 2.1's indexing is much faster and uses less RAM. Merge
> > in 2.1 does not take advantage of all the compression features of
> > normal indexing. Hopefully that will be fixed sometime...
> >
> > The index format is different with 2.1, so that might be one reason
> > it's slower than 2.05.
> >
> > You have two alternatives to merging. One is to index everything
> > at once, and the other is to specify more than one index file on
> > the command line when searching.
> >
> > And you can use -e if you are short on RAM.
> >
>
> Neither of those is particularly appealing. We have 4-5 years' worth
> of data and accumulate new data daily. The indexes are broken down
> by directory per month, so not merging would imply searching up to 50
> index files for a full search. Merging monthly is OK, but takes a
> LONGGGG time even for a one-month add. Since swish does not have
> an "exclude dir/file" switch, indexing the entire site minus the
> last two months' directories is not doable, but the same result is
> easily accomplished with a merge.
>
> I tried a whole site merge just to measure the time. It took about
> 45 minutes to create a 90 MB index file in economy mode -- LESS
> THAN A MERGE of 3 MB + 30 MB, which takes 90 minutes. This does
> not seem right.
>
> I'd like some suggestions...
>
> Michael
> Michael@Insulin-Pumpers.org
>
Michael@Insulin-Pumpers.org
Received on Tue Jul 16 00:15:50 2002