Re: suggestions for large multi-server indexing?

From: <D.J.Adams(at)not-real.soton.ac.uk>
Date: Mon Jun 12 2000 - 11:34:44 GMT
> 
> I'd like to hear your suggestions for doing large-scale multi-server
> indexing with swish-e.
> 
> In particular:
> 
> (1) What are the pros and cons of doing a single big index
> (giving it starting URLs across all servers) vs. doing a number of
> small indexes and merging them?
> 
> (2) What are issues likely to cause problems in scaling up?
> 
> (3) How large are some indexes that people have created successfully,
> and what hardware/time does it take to do it?
> 
> The case I'm interested in is creating a campus-wide index of the
> semi-official servers at our university.
> 
> No one knows exactly how much is out there to index, but rough
> guesses suggest 200-300 servers, with something like 100,000 -
> 200,000 HTML pages.
> ---
>      Albert Lunde                      Albert-Lunde@nwu.edu
> 

Swish-e has its strengths, but for an index of this magnitude I would
recommend ht://Dig (http://www.htdig.org).

-- 
 
David J Adams
<D.J.Adams@soton.ac.uk>
Computing Services
University of Southampton
Received on Mon Jun 12 07:37:25 2000