On 08/08/2008 04:49 PM, Michael Peters wrote:
> Brad Miele wrote:
>> not sure if this helps, but what we do is:
>
> Mine is simpler and just 1 line:
>
> $buffer =~ s/([^\p{IsASCII}])/sprintf('&#x%X;', ord($1))/ge;
>

I wrote:

http://search.cpan.org/~karman/Search-Tools-0.17/lib/Search/Tools/XML.pm#utf8_safe(_string_)

for just such cases, where UTF-8 encoded text needs to be stored as a Swish-e Property.

I think \p{IsASCII} requires the double encoding of & -> &amp; because \p works on characters, not bytes. The double-encoding approach will work just as well as the Search::Tools hack does, but for different reasons.

--
Peter Karman . peter(at)not-real.peknet.com . http://peknet.com/

_______________________________________________
Users mailing list
Users@lists.swish-e.org
http://lists.swish-e.org/listinfo/users

Received on Mon Aug 11 12:18:10 2008
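
To make the "characters, not bytes" point concrete, here is a minimal sketch that wraps the one-liner quoted above. The helper name escape_non_ascii and the sample strings are hypothetical, and the final s/&/&amp;/g step is only an assumed reading of what "double encoding" means in this thread, not something the posters spelled out.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper; the body is the one-liner from the quoted message.
sub escape_non_ascii {
    my ($buffer) = @_;
    # \p{IsASCII} matches by code point, so each non-ASCII *character*
    # becomes one hex numeric character reference.
    $buffer =~ s/([^\p{IsASCII}])/sprintf('&#x%X;', ord($1))/ge;
    return $buffer;
}

# Raw UTF-8 bytes for "cafe" with an e-acute (sample data, not from the thread).
my $bytes = "caf\xC3\xA9";

# Decoded first: the two bytes form one character, U+00E9.
my $chars = decode('UTF-8', $bytes);
print escape_non_ascii($chars), "\n";    # caf&#xE9;

# Not decoded: each byte is treated as its own character.
print escape_non_ascii($bytes), "\n";    # caf&#xC3;&#xA9;

# Assumed "double encoding" step: escape the ampersand of each reference.
(my $double = escape_non_ascii($chars)) =~ s/&/&amp;/g;
print $double, "\n";                     # caf&amp;#xE9;
```

If the buffer has not been decoded from UTF-8 first, the substitution still runs, but it emits one reference per byte rather than per character, which is probably not what is wanted.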