On Sat, Jan 19, 2008 at 01:15:03PM +0000, Kevin Porter wrote:
> I've somehow ended up with a few duplicates in my index, and need to
> remove them, or filter them out of the search results. Before
> implementing it on the web front-end side, I'd like to know if it's
> possible to filter them out with a command line option to swish-e, or to
> remove them totally? The problem URLs contain the string
> "widgetType=BlogArchive". I'm not even sure if swish-e matches terms
> against the URL, or can be made to.
If you use
then the path will be indexed. So then you could likely
filter on that string.
If you want finer control, check out ExtractPath.
But both of those would require re-indexing, so in that case you might
as well exclude the unwanted files when you re-index.
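
If re-indexing isn't practical right away, the front-end filtering
mentioned in the question could look roughly like the sketch below. It
is only an illustration in Python: it assumes the front-end already has
the result paths as a list of strings, and the function name and sample
URLs are made up for the example, not part of swish-e.

```python
# Substring taken from the original question; everything else here is
# a hypothetical helper, not a swish-e API.
UNWANTED = "widgetType=BlogArchive"


def drop_unwanted(paths, marker=UNWANTED):
    """Return the result paths that do not contain the marker substring."""
    return [p for p in paths if marker not in p]


if __name__ == "__main__":
    # Made-up sample paths standing in for real search results.
    sample = [
        "http://example.com/blog/post-1.html",
        "http://example.com/blog/?widgetType=BlogArchive&id=1",
    ]
    for path in drop_unwanted(sample):
        print(path)
```

This only hides the duplicates from display; they stay in the index
until it is rebuilt.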
Received on Sat Jan 19 09:53:48 2008