ReplaceRules not working as advertised

From: Colin Kuskie <ckuskie(at)not-real.sterlink.net>
Date: Mon Apr 22 2002 - 18:43:03 GMT
Greetings, all!

I'm using swish-e-2.1-dev-25 on a Red Hat 6.2 box, zlib but no libxml2.

I found that I was getting "duplicate" results when indexing:

1000 http://www.sunsetpres.org/Men/ "Sunset Presbyterian Men's Ministry Page" 29670
1000 http://www.sunsetpres.org/Men/index.html "Sunset Presbyterian Men's Ministry Page" 29670
911 http://www.sunsetpres.org/Worship/ "" 16778
911 http://www.sunsetpres.org/Worship/index.html "" 16778

So I added a ReplaceRule to remove the index.html:

ReplaceRules remove "index.html"

Based on reading the docs, I expected it to merge the results for
the two URLs, since they say:

           ReplaceRules allows you to make changes to file pathnames
           before they're indexed.  These changed file
           names or URLs will be returned in search results.

Thus the indexer shouldn't be able to tell the difference.  However,
that's not what the new indexing run shows:
1000 http://www.sunsetpres.org/Men/ "Sunset Presbyterian Men's Ministry Page" 29670
1000 http://www.sunsetpres.org/Men/ "Sunset Presbyterian Men's Ministry Page" 29670
911 http://www.sunsetpres.org/Worship/ "" 16778
911 http://www.sunsetpres.org/Worship/ "" 16778

Now I get duplicate results.  Should I try a later development version
to see if the behavior has changed, try a different config option, or is
this a legitimate bug?
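
The output above is consistent with the rule rewriting each path on its
own, with no merging of documents whose rewritten names collide.  A
minimal Python sketch of that assumption follows.  It is an illustration
only, not swish-e's actual code, and the function name apply_remove_rule
is hypothetical:

```python
# Illustration only: a hypothetical model of a "remove" rule,
# not how swish-e implements ReplaceRules.
def apply_remove_rule(path, pattern):
    """Strip every occurrence of pattern from a path."""
    return path.replace(pattern, "")


# The four URLs from the first indexing run.
fetched = [
    "http://www.sunsetpres.org/Men/",
    "http://www.sunsetpres.org/Men/index.html",
    "http://www.sunsetpres.org/Worship/",
    "http://www.sunsetpres.org/Worship/index.html",
]

# Each fetched document keeps its own index entry; only the stored
# name changes, so colliding names show up as duplicates.
indexed = [apply_remove_rule(p, "index.html") for p in fetched]

# Merging would need an explicit de-duplication step on the
# rewritten names (order-preserving here).
unique = list(dict.fromkeys(indexed))
```

Under this model the rewrite alone still yields four entries, and only
the separate de-duplication step reduces them to two.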

Colin
Received on Mon Apr 22 18:43:08 2002