(oops, I wanted to reply to the first post but clicked here by accident ;) ).
My recommendation is to use Perl 5.8.0 or later and to look at perldoc Encode, perldoc open, and perldoc -f open. If tr doesn't work because your characters are encoded in two bytes each, you can do
use Encode;
$s = decode_utf8($s);
That converts the string into Perl's internal representation, where characters are characters and you don't have to worry about how many bytes each one needs when encoded.
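For what it's worth, here is a minimal sketch of that decode-then-tr idea. The helper name strip_accents and the sample input are made up for illustration; the accented vowels are written as \x escapes so the example does not depend on how the source file itself is encoded.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical helper: takes UTF-8 encoded bytes and returns the string
# with the five accented vowels replaced by plain ASCII vowels.
sub strip_accents {
    my ($bytes) = @_;
    my $s = decode_utf8($bytes);    # bytes -> character string
    # U+00E1, U+00E9, U+00ED, U+00F3, U+00FA are a, e, i, o, u with acute accents.
    $s =~ tr/\x{E1}\x{E9}\x{ED}\x{F3}\x{FA}/aeiou/;
    return $s;
}

# UTF-8 bytes of "hol" + the five accented vowels + "on"
my $input = "hol\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBAon";
print strip_accents($input), "\n";    # prints holaeiouon
```

Once the string is decoded, tr sees five characters rather than ten bytes, so each accented vowel maps to exactly one replacement.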
perl -e '$_="áéíóú";tr/áéíóú/aeiou/;print'
aeaoauauau
It seems that "á" is treated as two characters, maybe "´" and "a", and each one gets its own replacement character ("a" and "e").
BTW, the return values of the encode and decode functions make me think the string is well formed and that it is tr// that is misbehaving. Am I completely lost?
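One way to check that guess is to spell out the bytes by hand. The sketch below assumes the terminal sends UTF-8, so each accented vowel arrives as two bytes (0xC3 followed by a second byte) rather than as an accent plus a letter; the helper name byte_tr is made up for illustration.

```perl
use strict;
use warnings;

# UTF-8 bytes of the five accented vowels, written as escapes.
my $vowels = "\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA";

# Hypothetical helper: applies tr with the same ten-byte search list
# that perl sees in the one-liner when "use utf8" is absent.
sub byte_tr {
    my ($s) = @_;
    # 0xC3 is first in the list, so it maps to "a"; 0xA1 maps to "e".
    # Repeated 0xC3 entries are ignored, and once "aeiou" runs out the
    # final "u" is reused for the remaining search characters.
    $s =~ tr/\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA/aeiou/;
    return $s;
}

print length($vowels), "\n";     # prints 10
print byte_tr($vowels), "\n";    # prints aeaoauauau
```

So tr is doing what it was told: it was handed ten single-byte characters instead of five two-byte ones.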
If you have UTF-8 encoded strings in your program file, you need to use the utf8 pragma (see perldoc utf8).
use utf8;
$s = 'holáéíóúon';
$s =~ tr/áéíóú/aeiou/;
print $s;
# prints holaeiouon
The code above may show the double characters explicitly since perlmonks.org is served as ISO-8859-1.