Size of database

From: Nicolas Huillard <nhuillard(at)not-real.ghs.fr>
Date: Fri Oct 08 1999 - 15:04:36 GMT
Hello,

I've just read the discussion archive about the size of the database, because I'm trying to index as many as 200000 documents for a benchmark, using 10000 different documents.
I'm doing this in several steps:
* index these 10000 documents under 20 different names (in order for swish-e to index them as different docs),
* each bunch of 10000 is indexed in 4 subsets of 2500, then merged,
* then I merge the first 10000 with the second 10000, to get a second index of 20000 docs,
* then I merge that second index (20000 docs) with the third 10000, to get a 30000 docs index,
* and so on up to 200000 docs
* I keep each index for the multiples of 10000 docs, for the benchmark
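
The steps above can be sketched as a small merge plan. This is only an illustration: the function names are hypothetical, the file-name pattern is taken from the listings in this message, and the names for batches after bench-01 are assumed to follow the same pattern. It only builds the list of inputs and outputs and does not call swish-e.

```python
# Sketch of the incremental merge plan. Helper names are hypothetical;
# the file-name pattern is copied from the listings in this message and
# assumed to continue unchanged for later batches.

SUBSET_STARTS = (1, 2501, 5001, 7501)  # first document of each 2500-doc subset


def subset_names(batch):
    """Names of the four subset indexes that make up one batch of 10000 docs."""
    return [f"bench-{batch:02d}-{start:04d}.swish-e" for start in SUBSET_STARTS]


def merge_plan(batches=20, docs_per_batch=10000):
    """Return one merge step per batch; each step yields a cumulative index."""
    plan = []
    for batch in range(batches):
        inputs = subset_names(batch)
        if batch > 0:
            # Every step after the first also merges in the previous cumulative index.
            inputs.insert(0, f"bench-{batch - 1:02d}.swish-e")
        plan.append({
            "output": f"bench-{batch:02d}.swish-e",
            "inputs": inputs,
            "docs": (batch + 1) * docs_per_batch,
        })
    return plan


if __name__ == "__main__":
    for step in merge_plan(3):
        print(step["output"], step["docs"], "docs, merged from", ", ".join(step["inputs"]))
```

With the default arguments the plan has 20 steps and the last cumulative index covers 200000 documents.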

The problem is that Swish-e 1.3.2 stops while merging the 30000 docs index, leaving two temp files:
-rw-rw-r--   1 nhuillar nhuillar 40279090 Oct  7 20:42 27633aaa
-rw-rw-r--   1 nhuillar nhuillar 70361575 Oct  7 20:34 27633baa
and not writing the merged file.
The already merged files are:
-rw-r--r--   1 nhuillar nhuillar 27450995 Oct  7 19:45 bench-00.swish-e
-rw-r--r--   1 nhuillar nhuillar 54561412 Oct  7 20:13 bench-01.swish-e
The first one has 10000 files and the second one 20000.
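
As a rough check on these numbers, the sizes above can be turned into bytes per indexed document. This is just arithmetic on the figures listed in this message; the variable and function names are made up for the example.

```python
# Sizes in bytes and document counts, copied from the listings above.
# Names are illustrative only.

merged_indexes = {
    "bench-00.swish-e": (27450995, 10000),
    "bench-01.swish-e": (54561412, 20000),
}

temp_files = {
    "27633aaa": 40279090,
    "27633baa": 70361575,
}


def bytes_per_doc(size_bytes, doc_count):
    """Average index size per indexed document."""
    return size_bytes / doc_count


if __name__ == "__main__":
    for name, (size, docs) in merged_indexes.items():
        print(f"{name}: {bytes_per_doc(size, docs):.1f} bytes per document")
    print("temp files total:", sum(temp_files.values()), "bytes")
```

Both merged indexes come out at a bit over 2700 bytes per document, so the index size appears to grow roughly linearly with the document count over these two steps.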

bench-00.swish-e is merged from :
-rw-r--r--   1 nhuillar nhuillar  7872284 Oct  7 12:55 bench-00-0001.swish-e
-rw-r--r--   1 nhuillar nhuillar  7326744 Oct  7 14:53 bench-00-2501.swish-e
-rw-r--r--   1 nhuillar nhuillar  8017522 Oct  7 16:47 bench-00-5001.swish-e
-rw-r--r--   1 nhuillar nhuillar  4972033 Oct  7 18:40 bench-00-7501.swish-e

bench-01.swish-e is merged from bench-00.swish-e and :
-rw-r--r--   1 nhuillar nhuillar  7872284 Oct  7 13:19 bench-01-0001.swish-e
-rw-r--r--   1 nhuillar nhuillar  7326744 Oct  7 15:16 bench-01-2501.swish-e
-rw-r--r--   1 nhuillar nhuillar  8017522 Oct  7 17:12 bench-01-5001.swish-e
-rw-r--r--   1 nhuillar nhuillar  4972033 Oct  7 18:54 bench-01-7501.swish-e

and bench-02.swish-e should be merged from bench-01.swish-e and :
-rw-r--r--   1 nhuillar nhuillar  7872284 Oct  7 13:43 
Received on Fri Oct 8 08:05:07 1999