At 07:37 AM 12/28/01 -0800, Jean-François PIÉRONNE wrote:
>As you can see, there is a big win from increasing the *HASHSIZE parameters.
>
>So, IMHO, it would be better to default the three HASHSIZE parameters to the
>following settings:
>HASHSIZE 1009
>BIGHASHSIZE 10001
>SEARCHHASHSIZE 100003
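A rough sketch of why larger tables might help, assuming the tables are
chained hash tables and that keys hash uniformly. The word count is the
"unique words indexed" figure from the test run in this message. Which
swish-e table actually holds those words is an assumption made only for
illustration, and average_chain_length is a hypothetical helper, not a
swish-e function.

```python
# Illustrative only: in a chained hash table the average bucket length is
# (number of keys) / (number of buckets). Fewer keys per bucket means
# fewer comparisons per lookup.

# Suggested values quoted above.
SUGGESTED_SIZES = {
    "HASHSIZE": 1009,
    "BIGHASHSIZE": 10001,
    "SEARCHHASHSIZE": 100003,
}


def average_chain_length(num_keys: int, num_buckets: int) -> float:
    """Expected entries per bucket, assuming keys are spread uniformly."""
    return num_keys / num_buckets


if __name__ == "__main__":
    unique_words = 682368  # "unique words indexed" in the test run
    for name, buckets in SUGGESTED_SIZES.items():
        load = average_chain_length(unique_words, buckets)
        print(f"{name:15s} {buckets:7d} buckets -> {load:8.1f} keys per bucket")
```

If all 682368 words landed in a single table, the average chain would be
about 676 entries at 1009 buckets and about 7 at 100003, which is the
kind of gap that could plausibly show up in indexing time.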
I don't have a lot of data to work with, but here's my test with the
different settings:
Old hash sizes:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
16203 moseley 64 0 324M 323M CPU1 1 9:43 97.95% 97.95% swish-e
682368 unique words indexed.
2 properties sorted.
38840 files indexed. 457923344 total bytes. 19964931 total words.
Elapsed time: 00:12:47 CPU time: 00:09:45
Indexing done!
(Sure would be nice to have some commas in those numbers...)
-rw-r--r-- 1 moseley moseley 70685718 Dec 30 17:19 index.swish-e
-rw-r--r-- 1 moseley moseley 44974080 Dec 30 17:19 index.swish-e.prop
Now, using your suggested values:
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
18774 moseley 64 0 325M 325M CPU0 1 7:44 96.29% 96.29% swish-e
682368 unique words indexed.
2 properties sorted.
38840 files indexed. 457923344 total bytes. 19964931 total words.
Elapsed time: 00:10:18 CPU time: 00:07:44
Indexing done!
-rw-r--r-- 1 moseley moseley 71045726 Dec 30 17:32 index.swish-e
-rw-r--r-- 1 moseley moseley 44974080 Dec 30 17:32 index.swish-e.prop
Hard to measure reliably with just one run each on a busy machine and such a
small amount of data, but the difference looks significant: elapsed time
dropped from 12:47 to 10:18 and CPU time from 9:45 to 7:44.
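For anyone who wants the difference as a percentage, here is a small
sketch that does the arithmetic on the two runs above. The helper names
(to_seconds, percent_saved) are made up for this example. The times and
index sizes are copied from the output pasted in this message.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string, as printed by the indexer, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def percent_saved(before: float, after: float) -> float:
    """Reduction from before to after, as a percentage of before."""
    return 100.0 * (before - after) / before


# (old hash sizes, suggested hash sizes), taken from the two runs above.
TIMES = {
    "Elapsed time": ("00:12:47", "00:10:18"),
    "CPU time": ("00:09:45", "00:07:44"),
}
INDEX_BYTES = (70685718, 71045726)  # index.swish-e before and after

if __name__ == "__main__":
    for label, (old, new) in TIMES.items():
        saved = percent_saved(to_seconds(old), to_seconds(new))
        print(f"{label}: {old} -> {new} ({saved:.1f}% less)")
    growth = -percent_saved(*INDEX_BYTES)
    print(f"Index file grew by {INDEX_BYTES[1] - INDEX_BYTES[0]} bytes ({growth:.2f}%)")
```

That works out to roughly 19% less elapsed time and 21% less CPU time,
for an index file about 0.5% larger. With a single run each, treat those
figures as indicative only.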
I've just now committed the changes to cvs. It will be interesting to see
if anyone else notices any improvement.
Thanks for the help!
--
Bill Moseley
mailto:moseley@hank.org
Received on Mon Dec 31 01:51:14 2001