Re: [swish-e] index a list of files

From: Brad Bauer <bbauer(at)>
Date: Wed Jul 09 2008 - 03:24:19 GMT
Sorry, my email client is not indenting when I reply.

I understand what you mean about separate indexes now.  Back to my original
question: is there a way to feed swish-e a specific list of local files
to index?   We are having a problem where pdfs we don't want indexed get
indexed, so I would like to index only the pdfs that have links to them (the
list I gather while spidering).  I am dealing with hundreds of pdfs, so it's
not always easy to spot and remove the unwanted ones.

Thanks for your replies,

B Bauer
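
One way to feed swish-e a fixed list of local files is the -S prog input
method, where an external program writes each document to standard output as
a small header block (Path-Name, Content-Length, optionally Last-Mtime), a
blank line, and then the raw bytes. Below is a minimal sketch of such a
feeder, not a tested production script. The list file format (one path per
line, "#" for comments) and the function names are assumptions made for
illustration.

```python
import os
import sys


def emit_document(path, out):
    """Write one file to `out` as prog-style headers, a blank line, then the body."""
    with open(path, "rb") as fh:
        data = fh.read()
    header = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(data)}\n"  # must be the exact byte count of the body
        f"Last-Mtime: {int(os.path.getmtime(path))}\n"
        "\n"
    )
    out.write(header.encode("utf-8"))
    out.write(data)


def emit_listed_files(list_path, out):
    """Emit every existing file named in `list_path` (hypothetical format: one path per line)."""
    count = 0
    with open(list_path, encoding="utf-8") as listing:
        for line in listing:
            path = line.strip()
            if not path or path.startswith("#"):
                continue  # skip blank lines and comments
            if not os.path.isfile(path):
                print(f"skipping missing file: {path}", file=sys.stderr)
                continue
            emit_document(path, out)
            count += 1
    return count


if __name__ == "__main__" and len(sys.argv) > 1:
    # Binary stdout so Content-Length matches the bytes actually written.
    emit_listed_files(sys.argv[1], sys.stdout.buffer)
```

Assuming the script is saved under a name such as feed_list.py (hypothetical),
it would be run with something like "swish-e -S prog -i ./feed_list.py", with
the list file name passed through SwishProgParameters in the config. The PDFs
would still need to go through whatever PDF-to-text filter is already
configured; this sketch only controls which files are handed to swish-e.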

-----Original Message-----
On Behalf Of Peter Karman
Sent: Tuesday, July 08, 2008 11:14 PM
To: Swish-e Users Discussion List
Subject: Re: [swish-e] index a list of files

Brad Bauer wrote on 7/8/08 9:34 PM:
> How hard is it to update from pre 2.4?  I got the impression it would 
> require quite a bit of rework to get our customizations recreated.

It depends on your customizations.

I moved from 2.2 to 2.4 back in 2003 when 2.4 came out, but I had only been
using 2.2 a short time. IIRC, the swish-e config was mostly portable, but
there were some significant changes to the library API.

> I am using -S prog with


> RE: Caching - I am attempting to avoid downloading pdfs since it is 
> very time consuming compared to the fs method. (They do, after all, 
> already exist on the server.)  Using the spider is taking 20+ minutes 
> for only a small section of the site, whereas using the fs setup I am 
> able to index the entire server in about 5 minutes.

That makes sense. I would take the approach I suggested before: skip the
PDFs via, create one index of PDFs, one of spidered content, and
then merge them.
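
The two-index approach above can be sketched as a short driver that builds
the three swish-e command lines: one index run for the PDFs, one for the
spidered content, and a merge with -M. This is only an illustration under
assumptions. All config and index file names are hypothetical, and the
sketch prints the commands instead of running them.

```python
import shlex


def index_command(config, index_file, method):
    """Build a swish-e indexing command for one config and source method."""
    return ["swish-e", "-c", config, "-S", method, "-f", index_file]


def merge_command(inputs, merged):
    """Build a swish-e merge command: -M input1 input2 ... output."""
    if len(inputs) < 2:
        raise ValueError("merging needs at least two input indexes")
    return ["swish-e", "-M", *inputs, merged]


def build_plan():
    """Return the three commands in the order they would be run."""
    # Hypothetical file names; substitute the real configs and index paths.
    return [
        index_command("pdfs.conf", "pdfs.index", "prog"),
        index_command("spider.conf", "spider.index", "prog"),
        merge_command(["pdfs.index", "spider.index"], "site.index"),
    ]


if __name__ == "__main__":
    for cmd in build_plan():
        print(shlex.join(cmd))
```

Each printed line could be run by hand or passed to subprocess.run once the
file names match the real setup.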

Users mailing list
Received on Tue Jul 8 23:24:16 2008