Unexpected index file size reduction

From: Lauren Landsburg <lauren(at)>
Date: Thu Sep 26 2002 - 16:50:43 GMT
I've been using swish-e for a number of years to index a site that has grown to around 4000 pages.

Suddenly, with the latest site additions a week or so ago, instead of increasing by a few meg, the index file size dropped from 35.9 meg to 25.5 meg.

The index files produced by swish-e had previously increased in size regularly with my additions to the website.  It takes about 2 hours to index the site using the http method.  We upgraded to 2.2 within the last year, and for the most part the changeover proceeded seamlessly: the time to create the index dropped as expected, and the index sizes continued to increase until this latest surprise.  Many of our files are substantial in size: over 150K of text.

At first I thought something must be drastically wrong: maybe the site had somehow lost pages and had to be restored!  However, some further checking suggested that nothing was missing, and that swish-e was indexing all pages as expected.

What could produce such a precipitous drop in the index file size?  

The smaller index size is lovely, and I think, though I haven't tested this, that the indexing process ran faster.  If I inadvertently did something that somehow balanced the file, I'd like to know!  Perhaps I could make further improvements along those lines.


Received on Thu Sep 26 16:55:34 2002