RE: Removing duplicates from merged index

From: Tac Tacelosky/Smokefree DC <tac(at)not-real.cheztac.com>
Date: Thu Dec 22 2005 - 23:31:19 GMT
Actually, the files are XML files with lots of items each, so the
duplicates aren't files with the same file name, but individual records
with the same field value (item_code).

-----Original Message-----
From: Bill Moseley [mailto:moseley@hank.org] 
Sent: Thursday, December 22, 2005 5:46 PM
To: Tac Tacelosky/Smokefree DC
Cc: Multiple recipients of list
Subject: Re: Removing duplicates from merged index

On Thu, Dec 22, 2005 at 02:40:17PM -0800, Tac Tacelosky/Smokefree DC
wrote:
> I'm trying to think of a way to remove duplicate items from a merged 
> index.  Any suggestions?  Each item has a unique item_code in the 
> individual index, but when merged, I'd like to get rid of duplicates 
> (defined as having the same item_code).

They have the same file name?  Merge should remove duplicate files based
on the file name.  It looks at swishlastmodified to figure out which
one is newer.
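
The rule described above (one entry per key, keeping the newest) can be
sketched in a few lines. This is only an illustration and not Swish-e's
merge code: the record layout, the dedupe_newest helper, and the
last_modified field are hypothetical names chosen for the example. The
same function can be keyed on item_code instead of the file name, for
example as a pass over the records before they are indexed.

```python
def dedupe_newest(records, key="path", stamp="last_modified"):
    """Keep one record per key value, preferring the largest timestamp.

    Hypothetical helper for illustration; not part of Swish-e.
    """
    newest = {}
    for rec in records:
        k = rec[key]
        # Replace the stored record only when this one is strictly newer.
        if k not in newest or rec[stamp] > newest[k][stamp]:
            newest[k] = rec
    return list(newest.values())


# Made-up sample records: two of them share both a path and an item_code.
records = [
    {"path": "a.xml", "item_code": "A1", "last_modified": 100},
    {"path": "b.xml", "item_code": "B7", "last_modified": 150},
    {"path": "a.xml", "item_code": "A1", "last_modified": 200},
]

by_path = dedupe_newest(records)                    # keyed on file name
by_code = dedupe_newest(records, key="item_code")   # keyed on item_code
```

When several records with the same item_code live inside one XML file,
only the second call applies, since the file name no longer tells the
duplicates apart.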

--
Bill Moseley
moseley@hank.org

Unsubscribe from or help with the swish-e list: 
   http://swish-e.org/Discussion/

Help with Swish-e:
   http://swish-e.org/current/docs
   swish-e@sunsite.berkeley.edu
Received on Thu Dec 22 15:31:20 2005