Re: [swish-e] Question and Who Uses It Page

From: Peter Karman
Date: Tue, 13 Nov 2012 22:48:43 -0600
lisab(at)not-real.hospitalsoup.com wrote on 11/13/12 3:03 AM:
> Hello, We've been using Swish on one of our servers but I'm new to  
> Swish and we will be adding in resume and healthcare policy  
> information searching capabilities. I'm not familiar with indexing  
> speed requirements when adding larger data sets and was wondering if  
> anyone could give me guidelines on how long it may take to index close  
> to  500,000 new documents? My plan was to try to schedule the indexing  
> on off peak hours but if I had some idea how long that a typical index  
> would take with adding in those files then that would be helpful to me  
> as I try to get up to speed.

The time to index will depend on the size of the documents and how many fields
(MetaNames and PropertyNames) you have defined. In my experience, disk I/O is
the biggest bottleneck.

I'd suggest timing an index run on a smaller sample of your doc set and then
extrapolating.
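
As a rough illustration of that extrapolation, here is a minimal Python sketch.
The function names and the sample figures are hypothetical, and it assumes
indexing time grows roughly linearly with document count, which may not hold
once the index outgrows memory or disk I/O starts to dominate. Pass
time_command whatever indexing command you normally run.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def extrapolate_index_time(sample_docs, sample_seconds, total_docs):
    """Scale a sample timing up to total_docs.

    Assumes linear scaling, which is only a first approximation.
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    return sample_seconds * (total_docs / sample_docs)


if __name__ == "__main__":
    # Hypothetical figures: 5,000 sample docs indexed in 120 seconds.
    estimate = extrapolate_index_time(5000, 120.0, 500000)
    print(f"Estimated time: {estimate / 3600:.1f} hours")
```

Timing two or three sample sizes (say 5,000, 20,000 and 50,000 docs) would show
whether the linear assumption is reasonable for your data before you trust the
estimate.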

> 
> Also, we'd be honored as well to be included in the users page.  
> Company is HospitalSoup.com and we provide Hospital Ratings, Reviews  
> and HealthCare Information

Thanks. Added in r3251. It should appear on the site in the next few hours.


> 
> I also have to learn how to index Power Point documents but that will  
> be for another night's work. Thank you and your tips are most  
> appreciated.
> 

SWISH::Filter::pp2html and SWISH::Filter::pp2txt both claim to handle PowerPoint
docs. Check the docs for examples.


-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Wed Nov 14 2012 - 04:50:45 GMT