Not sure if this helps, but what we do is:
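## encode non-ASCII characters as numeric entities
## (ord() gives the code point only for decoded or Latin-1 strings)
$res->{$_} =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
## swap out common euro characters to english version search letters
if ($res->{$_} =~ /&#/) {
    my $to_eng = $res->{$_};
    $to_eng =~ s/&#246;/o/g;   # o with umlaut
    $to_eng =~ s/&#214;/O/g;   # O with umlaut
    $to_eng =~ s/&#233;/e/g;   # e with acute
    $to_eng =~ s/&#232;/e/g;   # e with grave
    $to_eng =~ s/&#200;/E/g;   # E with grave
    $to_eng =~ s/&#201;/E/g;   # E with acute
    $to_eng =~ s/&#209;/N/g;   # N with tilde
    $to_eng =~ s/&#241;/n/g;   # n with tilde
    $to_eng =~ s/&#220;/U/g;   # U with umlaut
    $to_eng =~ s/&#252;/u/g;   # u with umlaut
    ## append to the keywords (inside the block, where $to_eng is in scope)
    $res->{$_} .= " " . $to_eng;
}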
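If it is easier to try out on its own, the same idea can be sketched as a small table-driven script. This is only a sketch: the %to_ascii table and the add_search_variant helper are names made up for illustration, and it assumes the input is already a decoded character string.

```perl
use strict;
use warnings;

# Hypothetical lookup table: code point => plain ASCII search letter.
# It holds the same ten characters as the substitutions above.
my %to_ascii = (
    246 => 'o', 214 => 'O',
    233 => 'e', 232 => 'e',
    200 => 'E', 201 => 'E',
    209 => 'N', 241 => 'n',
    220 => 'U', 252 => 'u',
);

# Hypothetical helper: takes a decoded character string and returns the
# entity-encoded text, followed by an ASCII-folded copy when one is needed.
sub add_search_variant {
    my ($text) = @_;

    # Encode every non-ASCII character as a numeric entity.
    (my $encoded = $text) =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;

    # Nothing was encoded, so there is no variant to append.
    return $encoded unless $encoded =~ /&#/;

    # Replace known entities with their ASCII letter; leave the rest alone.
    (my $folded = $encoded) =~ s/&#(\d+);/exists $to_ascii{$1} ? $to_ascii{$1} : "&#$1;"/ge;

    return "$encoded $folded";
}

print add_search_variant("se\x{F1}or"), "\n";   # prints "se&#241;or senor"
```

Anything that is not in the table, such as Chinese or Hebrew characters, stays as an entity in both copies, so this only helps with the accented Latin letters listed.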
Sorry if I am missing your needs entirely.
Brad
--------------------------------------------
Brad Miele
Director of Technology
rumblefish
919 SW Taylor Suite 300
Portland, OR, 97205, Earth
url: http://www.rumblefish.com
email/aim: brad@rumblefish.com
vox: 503-248-0706
On Aug 8, 2008, at 2:31 PM, Michael Peters wrote:
> amscopub-pcshop@yahoo.com wrote:
>> If you are using international characters, why don't you remove the
>> accents instead?
>>
>> For example, change the Spanish "se~nor" to "senor".
>
> That's what I'll do if there is no other option. But that doesn't help
> with things like Chinese or Hebrew characters. I have to deal with all
> of them, not just modified ascii chars.
>
> --
> Michael Peters
> Plus Three, LP
>
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
Received on Fri Aug 8 17:42:32 2008