RE: HTTP Indexing Times for Different OSs and Swish Ver

From: Deane Barker <deane.barker(at)not-real.bankfirstcorp.com>
Date: Fri Jan 11 2002 - 21:09:12 GMT
Oh-h-h-h-h, there's a delay between requests...that would make sense.  19
pages...21 minutes...it all makes sense now.  :-)

In my defense, however, this delay does not appear on Windows machines.  I
loaded it on two Windows machines before trying it on Linux, and there is NO
delay between page requests even though I never changed it from the default.
That's why I wasn't looking for it on the Linux machine.

Thanks...my face is red.

Deane

-----Original Message-----
From: Bill Moseley [mailto:moseley@hank.org] 
Sent: Friday, January 11, 2002 3:03 PM
To: deane.barker@bankfirstcorp.com; Multiple recipients of list
Subject: Re: [SWISH-E] HTTP Indexing Times for Different OSs and Swish Versions


At 12:12 PM 01/11/02 -0800, Deane Barker wrote:
>Machine A:  Athlon 750 MHz, 224MB RAM running Mandrake Linux 8.1  ("SWISH-E
>2.0") 
>
>Machine B:  Athlon 1 GHz, 384MB RAM running Windows XP Home  ("SWISH-E
>2.1-dev-24") 

Those are not the same version of swish.  2.0.x is MUCH slower than
2.1-dev-24.

Also, indexing for three seconds is not a very good test, either.  


If you have the same 30,000 files on Windows and on Linux then it might be
easier to compare.  I'm sure you won't see much difference between Linux
and Windows with the same hardware.

Yesterday I tried indexing 24,000 files in my /usr/doc with 2.0.5,
2.1-dev-20 (basically the same), and the current dev version.  The current
version took 4 minutes.  The others never finished after over an hour (those
versions use a lot more RAM, so my machine was swapping).

Now, comparing spidering?  Good luck.

>
>Here's where it gets interesting: I set up the swishspider and unleashed
>them both on the same web site (very small -- just 19 unique pages) via
>HTTP crawl at the same general time (one just after another, late at night
>when volume was low; web server logs indicate that the spider was the only
>active session on the web site at the time).
>
>The time differences were massive: 
>
>Machine A (Linux / Swish 2.0):  21 minutes  (that's MINUTES, not seconds...)
>
>Machine B (Windows / Swish 2.1-dev-24):  14 seconds

I think you forgot to set the delay to zero.  By default, using -S http,
swish waits one minute between requests.
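
As a rough sanity check on the numbers above, here is a minimal sketch of the arithmetic. It assumes the spider fetches pages one at a time and waits once between each pair of consecutive requests. The function name and the 0.7-second per-page fetch time are made up for illustration and are not taken from swish itself.

```python
def estimated_crawl_seconds(pages, delay_seconds, fetch_seconds=0.0):
    """Rough wall-clock estimate for a sequential crawl.

    Assumes one delay between each pair of consecutive requests
    (pages - 1 gaps) plus a fixed fetch time per page.
    """
    if pages <= 0:
        return 0.0
    gaps = pages - 1
    return gaps * delay_seconds + pages * fetch_seconds


# 19 pages with a 60-second delay: the waiting alone accounts for 18 minutes.
with_delay = estimated_crawl_seconds(19, 60)
print(with_delay / 60, "minutes spent waiting")

# Same 19 pages with no delay and an assumed 0.7 s per fetch (hypothetical).
no_delay = estimated_crawl_seconds(19, 0, fetch_seconds=0.7)
print(no_delay, "seconds with the delay set to zero")
```

With a 60-second delay, the waiting by itself comes to 18 minutes for 19 pages, which is in line with the 21 minutes reported once fetch and indexing time are added. With the delay at zero, the estimate drops to a matter of seconds, which is the same order as the 14-second Windows run.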


-- 
Bill Moseley
mailto:moseley@hank.org
Received on Fri Jan 11 21:09:46 2002