Re: [swish-e] Seeding a swish-e index

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Sep 05 2008 - 15:19:41 GMT
On Fri, Sep 05, 2008 at 04:20:38PM +1000, Sean wrote:
> I use swish-e in an intranet environment using file system indexing.
> 
> There are a number of off-site resources that are not currently indexed.
> 
> Users have indicated that they would like to be able to search for these off-site resources.
> 
> I have managed to achieve this by adding .html files in the file system path for each of these 
> off-site resources, like the following:
> 
> ____
> 
> <html>
> <head>
> <meta http-equiv="Refresh" content="0; url=http://off.site.url/">

Swish won't follow that refresh, if that's what you had in mind.

Your options are to spider the other site, or to maintain a list of
URLs that you want to index and write a program to fetch those while
indexing.
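
For the second option, here is a rough sketch in Python of the kind of
program I mean. It assumes you run swish-e with the "prog" input method,
where the program prints each document to stdout as a Path-Name: header,
a Content-Length: header (byte count), a blank line, and then the
content. The script itself and the helper names (format_record, main)
are made up for illustration; check the docs for the exact header
format before relying on it.

```python
#!/usr/bin/env python3
"""Fetch a list of URLs and print them in the form swish-e's prog method reads."""
import sys
import urllib.request


def format_record(url: str, body: bytes) -> bytes:
    # Assumed record layout: headers, a blank line, then exactly
    # Content-Length bytes of document content.
    header = f"Path-Name: {url}\nContent-Length: {len(body)}\n\n"
    return header.encode("utf-8") + body


def main(urls):
    out = sys.stdout.buffer  # write bytes so the length header stays accurate
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                body = resp.read()
        except OSError as err:
            # Skip pages that cannot be fetched rather than aborting the index run.
            print(f"skipping {url}: {err}", file=sys.stderr)
            continue
        out.write(format_record(url, body))
    out.flush()


if __name__ == "__main__":
    # URLs to fetch are taken from the command line.
    main(sys.argv[1:])
```

You would point swish-e at the script with something like
"swish-e -S prog -i ./fetch_urls.py" (the file name is hypothetical).
The URLs are then stored as the document paths, so search results link
straight to the off-site pages and the redirect files are not needed.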

> Can anyone suggest how to ensure or improve the ranking of such a
> file in the search engine results ?

Swish-e's ranking is not very sophisticated.  Ranking is mostly based
on the number of words matched and a little on where in the document
the matches happen.  If you want some pages to always rank higher than
others you could add some property to each file and then sort first by
that property and then use swish-e's rank as the secondary sort.
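
To show the ordering I have in mind, here is a small Python sketch. The
"priority" property is a hypothetical name for whatever numeric property
you add to each file, and the Hit type is only for illustration; it is
not part of swish-e. In swish-e itself you would get the same effect
with the sort option, roughly "-s priority desc swishrank desc", after
defining the property in your config.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    path: str
    priority: int  # hypothetical property stored with each document
    rank: int      # swish-e's own relevance rank for the query


def order_hits(hits: List[Hit]) -> List[Hit]:
    # Primary key: priority, highest first.
    # Secondary key: rank, highest first, to order hits within a priority.
    return sorted(hits, key=lambda h: (-h.priority, -h.rank))


hits = [
    Hit("/docs/local-page.html", priority=0, rank=1000),
    Hit("/docs/off-site-a.html", priority=10, rank=300),
    Hit("/docs/off-site-b.html", priority=10, rank=750),
]
for hit in order_hits(hits):
    print(hit.priority, hit.rank, hit.path)
```

The two boosted pages come out ahead of the local page even though its
rank is higher, and rank only decides the order between them.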

-- 
Bill Moseley
moseley@hank.org

Received on Fri Sep 5 11:19:43 2008