Re: [swish-e] Question and Who Uses It Page

From: Peter Karman
Date: Tue, 13 Nov 2012 22:48:43 -0600
lisab(at) wrote on 11/13/12 3:03 AM:
> Hello, We've been using Swish on one of our servers but I'm new to  
> Swish and we will be adding in resume and healthcare policy  
> information searching capabilities. I'm not familiar with indexing  
> speed requirements when adding larger data sets and was wondering if  
> anyone could give me guidelines on how long it may take to index close  
> to  500,000 new documents? My plan was to try to schedule the indexing  
> on off peak hours but if I had some idea how long that a typical index  
> would take with adding in those files then that would be helpful to me  
> as I try to get up to speed.

The time-to-index will depend on the size of the documents and how many fields
(MetaNames and PropertyNames) you have defined. Disk I/O is the big bottleneck IME.

I'd suggest profiling a smaller sample of your doc set and then extrapolating.
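
A minimal sketch of that sample-and-extrapolate approach, in Python. The helper names are hypothetical, and the estimate assumes indexing time grows roughly linearly with document count, which may not hold once disk I/O or index size starts to dominate. Treat the result as a rough planning figure only.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def estimate_total_seconds(sample_docs, sample_seconds, total_docs):
    """Linearly scale the sample timing up to the full document count.

    Assumption: per-document cost stays about the same at full scale.
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    return sample_seconds * total_docs / sample_docs


def format_duration(seconds):
    """Format a number of seconds as 'Hh MMm SSs'."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Example usage (not run here). The config and index file names are
# placeholders for a setup that indexes only the sample documents:
#   elapsed = time_command(["swish-e", "-c", "sample.conf", "-f", "sample.index"])
#   print(format_duration(estimate_total_seconds(5000, elapsed, 500000)))
```

For instance, if a 5,000-document sample took 60 seconds, the linear estimate for 500,000 documents would be 6,000 seconds, or about 100 minutes. Timing two or three sample sizes would show whether the linear assumption is reasonable for your data.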

> Also, we'd be honored as well to be included in the users page.  
> Company is and we provide Hospital Ratings, Reviews  
> and HealthCare Information

thanks. added in r3251. Should appear on the site in the next few hours.

> I also have to learn how to index Power Point documents but that will  
> be for another night's work. Thank you and your tips are most  
> appreciated.

SWISH::Filter::pp2html and SWISH::Filter::pp2txt both claim to handle PowerPoint
docs. Check the docs for examples.

Peter Karman  .  .  peter(at)
Received on Wed Nov 14 2012 - 04:50:45 GMT