Re: maximum size of files

From: David L Norris <dave(at)not-real.webaugur.com>
Date: Wed Jan 07 2004 - 05:24:48 GMT
On Tue, 2004-01-06 at 23:31, venkatesh ramanathan wrote:
> I am using Swish-e version 2. What is the maximum
> size it can index up to? What will be the size of the
> index file for that size?

All released versions of SWISH-E support index files up to 2 GB.

The index file size will depend completely on what you choose to store
in the index.  My only advice is to test SWISH-E with your data.

You must recreate your index to add new data, and indexing a large set
of documents takes a very long time.  For large collections of
frequently changing data this becomes impractical to manage.  SWISH-E
may be well suited to large document archives that never change (e.g.
books, magazines, news and email archives).



Current SWISH-E development (2.5.x) should support files up to 18
exabytes on Linux and UNIX systems.  (SWISH-E on Win32 and Win64 should
eventually support 9 exabyte files.)  Storage space and the time
required to create an index will be primary limiting factors.
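
The 2 GB, 9 exabyte and 18 exabyte figures line up with the largest
values of common file-offset integer widths.  That mapping is an
assumption on my part; the message does not say which offset types
SWISH-E uses.  A minimal sketch of the arithmetic, in decimal units:

```python
# Assumption: each quoted limit is the largest value of a common
# file-offset integer type. The message itself does not state this.
OFFSET_LIMITS = {
    "signed 32-bit": 2**31 - 1,
    "signed 64-bit": 2**63 - 1,
    "unsigned 64-bit": 2**64 - 1,
}


def in_decimal_units(num_bytes):
    """Return (gigabytes, exabytes) using decimal units (10**9, 10**18)."""
    return num_bytes / 10**9, num_bytes / 10**18


for name, limit in OFFSET_LIMITS.items():
    gigabytes, exabytes = in_decimal_units(limit)
    print(f"{name:>16}: {limit} bytes = {gigabytes:.2f} GB = {exabytes:.2f} EB")
```

Under that assumption, a signed 32-bit offset gives about 2.15 GB
(2 GiB), a signed 64-bit offset about 9.22 exabytes, and an unsigned
64-bit offset about 18.45 exabytes.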

The largest index I know about was Jose's recent test with a 3 GB index
and 5 GB prop files under Red Hat Linux 9.  That test simulated 1.3
million documents composed of 500-1000 random words from a dictionary.
Indexing ran for 3 hours, and search results returned in 0.2 seconds.
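
To put those test figures in per-document terms, here is a rough
back-of-the-envelope calculation.  It uses only the rounded numbers
quoted above and assumes decimal gigabytes, so treat the results as
estimates rather than measurements:

```python
# Figures as quoted in the message; all were rounded by the author.
documents = 1_300_000
indexing_seconds = 3 * 60 * 60   # "3 hours"
index_bytes = 3 * 10**9          # assumes decimal gigabytes
prop_bytes = 5 * 10**9           # assumes decimal gigabytes

# Average indexing throughput over the whole run.
docs_per_second = documents / indexing_seconds

# Average on-disk cost (index plus prop files) per document.
bytes_per_document = (index_bytes + prop_bytes) / documents

# Total words indexed, given 500-1000 words per document.
words_low, words_high = documents * 500, documents * 1000

print(f"throughput:        {docs_per_second:.0f} documents/second")
print(f"storage:           {bytes_per_document:.0f} bytes/document")
print(f"total words (min): {words_low}")
print(f"total words (max): {words_high}")
```

That works out to roughly 120 documents per second and about 6 KB of
index and prop data per document.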

I have attempted to create large index files on Win32 using Jose's test
script and achieved up to a 2 GB index with 3.5 GB prop files.  Some
issues remain with creating an index file over 2 GB.

-- 
 David Norris
  http://www.webaugur.com/dave/
  ICQ - 412039



Received on Wed Jan 7 05:24:57 2004