Re: Differences

From: Nathan Vonnahme <nathan.vonnahme(at)not-real.bannerhealth.com>
Date: Fri Jun 27 2003 - 18:54:31 GMT
If your content is in some sort of database, it seems to me the easier approach would be to feed the swish index just the new content every day, if you can. You could do that either with a script that queries the database directly, or by creating an alternative version of the site where only the new stuff is displayed, then spidering it and translating the URLs to the real site.
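
As a rough sketch of the script approach (in Python, and assuming swish-e's "prog" input method, where an external program writes each document as a Path-Name / Content-Length header block followed by the body): the function names and the (url, body) row shape below are hypothetical, and the database query that produces the day's new rows is left out.

```python
import sys

def swish_prog_record(url: str, body: str) -> bytes:
    """Format one document as a header block plus body (assumed -S prog layout)."""
    payload = body.encode("utf-8")
    # Content-Length counts bytes, not characters.
    header = f"Path-Name: {url}\nContent-Length: {len(payload)}\n\n"
    return header.encode("utf-8") + payload

def emit_new_items(rows, out=None):
    """Write one record per (url, body) pair; rows would come from your own query."""
    if out is None:
        out = sys.stdout.buffer
    for url, body in rows:
        out.write(swish_prog_record(url, body))

if __name__ == "__main__":
    # Hypothetical data standing in for "items added today".
    emit_new_items([("http://www.example.com/page1", "A new paragraph about Linux.")])
```

The Path-Name is set to the real site URL, so search results point at the live page even though swish only ever sees the new text.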

Or have your layout code automatically put <noindex> tags around old sections; that would save having to keep two copies and compare them. If you use flat files, you could use the diff tool to compare yesterday's and today's versions of each file and feed only the additions to swish.
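
For the flat-file case, here is a minimal sketch of the "feed only the additions" idea, using Python's difflib in place of the command-line diff tool. The function name is hypothetical, and it assumes you keep yesterday's copy of each file around to compare against.

```python
import difflib

def added_lines(old_text: str, new_text: str) -> list[str]:
    """Return the lines that appear in new_text but were not in old_text."""
    old = old_text.splitlines()
    new = new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" is brand-new text; "replace" is changed text, treated as new.
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added

if __name__ == "__main__":
    yesterday = "Intro paragraph.\nLinux is mentioned here.\n"
    today = yesterday + "Another paragraph about Linux.\n"
    print("\n".join(added_lines(yesterday, today)))
```

Only the returned lines would be handed to the indexer, so a search on the day of the change matches the new paragraph and nothing older.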

Anyway, it seems more straightforward to limit what swish is paying attention to when you do the indexing, rather than trying to build the newness sensing into the search side of things.

-n

--
nathan vonnahme  :  programmer/analyst  :  nathan.vonnahme@bannerhealth.com
fairbanks memorial hospital/denali center  :  1650 cowles st, fairbanks alaska 99701
voice 907.458.5464  :  fax 907.458.5030  


>>> John Almberg <jalmberg@identry.com> 06/27/03 07:27AM >>>
I've used SWISH on a number of sites and think it's great.

For a new site, I've got a bit of a twist...

Every day, the site will change a bit -- new data items will be added to 
various pages. What I need to do is search this site for keywords and 
obtain a result set that contains only NEW results.

For example, if there is a page that contains the word 'Linux' in paragraph 
3, and then the page is updated with a new paragraph that also contains the 
word 'Linux', then:
	- a search conducted on the day of the change will return one result -- 
	  the new instance of 'Linux'
	- a search conducted on any subsequent day will return zero results 
	  (unless a new paragraph is added)

Is there any in-built feature of SWISH that will help me here? The only way 
I can think to do this is to save the previous day's results, run the query 
on the next day, and check for differences. However, in the real-life 
situation, there will be tens of thousands of these queries and storing 
results will be pretty inefficient.

Any help, much appreciated!

-- John

-------------------------
Identry, LLC
Northport, NY 11768

Phone: 631.754.8440
Fax: 631.980.4262
Email: jalmberg@identry.com 
Web: www.identry.com 

Member: ASDA, APS, ANA, Long Island Web Developers Guild

<><><><><><><><><><><><><><><><><><><><><><><><>
    Read the latest issue of SELLING ON THE WEB
    at http://www.identry.com 
<><><><><><><><><><><><><><><><><><><><><><><><>
Received on Fri Jun 27 18:54:36 2003