Re: Lucene vs Swish-e

From: Dobrica Pavlinusic <dpavlin(at)not-real.rot13.org>
Date: Thu Jan 20 2005 - 20:40:39 GMT
On Thu, Jan 20, 2005 at 11:51:42AM -0800, Peter Karman wrote:
> Anyone done any benchmarking lately of Swish-e vs Lucene (or Nutch)? I'm 
> just curious...

I have done some benchmarks against Lucene and CLucene in the past, and
swish-e is much faster. (I once noted that Xapian is an order of
magnitude slower than swish-e and got an e-mail from Olly [Xapian
developer], which forced me to re-run the tests and find that it is only
about twice as slow.)

I am in the process of benchmarking different search engines (it's not
easy because I have to take into account differences in query language
and special rules about what is indexed). For now, I have tested swish-e
(using the old and the new incremental file format), Xapian and mg
(Managing Gigabytes) and have the following numbers to report:

                      real    user    sys
swish-e incremental   0.357   0.027   0.023
swish-e               0.327   0.057   0.036
xapian omega          0.458   0.029   0.016
mg                    0.327   0.022   0.035

The search query was "full AND text AND search", run on my test corpus of

316,817 unique words indexed.
62,544 files indexed.  691,244,737 total bytes.  55,650,106 total words.
Elapsed time: 00:11:32 CPU time: 00:05:03

as reported by swish-e. Tests were done 6 times, and before the first
search the memory caches were flushed, so the first search takes much
longer than the following ones (numbers above are averages).
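
A minimal sketch of the timing procedure described above: run one search
command several times and average the real, user and sys times. This is
not the script used for the numbers in this post. The function name
time_command and the default of 6 runs are assumptions for illustration,
the cache flush before the first run is not reproduced (it is OS-specific
and usually needs root), and the child CPU times from os.times() are only
meaningful on Unix.

```python
import os
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=6):
    """Run argv `runs` times; return mean (real, user, sys) in seconds.

    Hypothetical helper: it mimics what `time <command>` reports,
    averaged over several runs.
    """
    samples = []
    for _ in range(runs):
        cpu_before = os.times()
        wall_before = time.perf_counter()
        # Discard the search output; only the timings are of interest.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        wall_after = time.perf_counter()
        cpu_after = os.times()
        samples.append((
            wall_after - wall_before,
            cpu_after.children_user - cpu_before.children_user,
            cpu_after.children_system - cpu_before.children_system,
        ))
    # Average each column (real, user, sys) across all runs.
    return tuple(statistics.mean(column) for column in zip(*samples))


if __name__ == "__main__":
    # Placeholder command; substitute the search engine invocation
    # under test, e.g. a swish-e query against your own index.
    real, user, system = time_command([sys.executable, "-c", "pass"])
    print(f"real {real:.3f}  user {user:.3f}  sys {system:.3f}")
```

Because the first run after a cache flush is much slower than the rest,
a plain mean over 6 runs is dominated by that first sample; reporting the
first run separately from the warm runs would be a reasonable variation.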

I will write a detailed report when I finish and post a link to this
list if there is interest.

-- 
Dobrica Pavlinusic               2share!2flame            dpavlin@rot13.org
Unix addict. Internet consultant.             http://www.rot13.org/~dpavlin
Received on Thu Jan 20 12:40:39 2005