[swish-e] help with accented characters

From: Anthony Sheetz <sheetzam(at)>
Date: Wed Oct 24 2007 - 20:56:08 GMT
I am working on using swish-e and the included spider to index a web  
site which uses the occasional accented character.  The most common  
one is the acute e, or in html: &eacute;. A specific example is  
So, from what I can tell, the HTML-encoded entr&eacute;e is being
stored as entrée (the accented e translated to its proper encoding)
in the database.  I can then search for entree (unaccented e) and get
results that contained the HTML-encoded entr&eacute;e.

However, and this is the problem I need to solve: the results are
returned as entrée (the accented e translated to its proper encoding)
rather than as the HTML-encoded entr&eacute;e.  I need the text as it
was originally presented, not as it was translated.  What is the best
way to do this?
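
To make the round trip concrete, here is a minimal Python sketch of the three transformations described above. It is not swish-e code: the function names are hypothetical, the accent folding via Unicode decomposition is only an assumed stand-in for what TranslateCharacters :ascii7: does, and the last function shows one possible post-processing step on returned results, not a swish-e feature.

```python
import html
import unicodedata
from html.entities import codepoint2name


def decode_entities(text):
    """Turn HTML entities into real characters, as seems to happen at index time."""
    return html.unescape(text)


def fold_to_ascii(text):
    """Strip accents so an unaccented query can match (assumed analogue of :ascii7:)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def encode_entities(text):
    """Re-encode non-ASCII characters as HTML entities for display.

    ASCII characters (including & and <) are passed through untouched.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")   # named entity
        else:
            out.append("&#%d;" % cp)                     # numeric fallback
    return "".join(out)


source = "entr&eacute;e"
stored = decode_entities(source)     # what appears to be stored
folded = fold_to_ascii(stored)       # what an unaccented search would match
display = encode_entities(stored)    # what I would like to get back
```

Re-encoding like this yields equivalent HTML rather than the byte-for-byte original, so it only helps if the original pages used named entities consistently.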

I am using SWISH-E 2.4.4 on Gentoo Linux.
In my config I have set TranslateCharacters :ascii7:
Anything else you need to know?

Thanks in advance for any help.
Anthony Sheetz
Zurka Interactive

Received on Wed Oct 24 16:56:20 2007