geektron has asked for the wisdom of the Perl Monks concerning the following question:

i've been digging around and can't find anything useful quickly ( i *should* have this finished by lunch, local time ).

Foreign language characters... had some helpful ideas, but i *think* the strings i need to run tr// on have accented capitals, and i can't get tr// to work.

here's the beginning of the code:

## $fixedName is fetched from DB and copied to new var
my $fixedName = "MATíAS LóPEZ";
$fixedName =~ tr/íÍ/II/;
print "FIXED: $fixedName \n";

problem is ... tr// doesn't catch anything. i get:

FIXED: MATíAS LóPEZ

i've tried to 'downgrade' the character set and gotten lots of "?" characters. I've tried Unicode::String and that didn't do anything. i tried Encode.pm and got lots of binary-looking characters ...

any advice or pointers?
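
One way to narrow this down, offered as a sketch rather than a diagnosis: dump the raw bytes of the value coming back from the DB. A single ED byte for í suggests ISO-8859-1; a C3 AD pair suggests UTF-8, in which case a tr// written against Latin-1 characters has nothing to match. The helper name hex_bytes is made up for illustration, and the two sample strings are assumptions standing in for the real DB value.

```perl
use strict;
use warnings;

# Hypothetical helper: render each byte of a byte string as two hex digits.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# The same name in two encodings (hex escapes keep this file pure ASCII).
my $latin1 = "MAT\xEDAS";        # i-acute as one ISO-8859-1 byte
my $utf8   = "MAT\xC3\xADAS";    # i-acute as two UTF-8 bytes

print hex_bytes($latin1), "\n";  # 4D 41 54 ED 41 53
print hex_bytes($utf8),   "\n";  # 4D 41 54 C3 AD 41 53
```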

Replies are listed 'Best First'.
Re: ISO-8859-1 characters to ASCII
by ysth (Canon) on Feb 25, 2004 at 16:44 UTC
      thanks. turns out your version of removing diacritics, plus a bit of uc, fixed it.
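
ysth's reply is not reproduced above, so the following is only a guess at the kind of approach "removing diacritics, plus a bit of uc" might describe, not ysth's actual code. It assumes the input is ISO-8859-1 bytes: decode them, decompose each accented letter into base letter plus combining mark, drop the marks, then uppercase. The sub name strip_diacritics is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper: takes ISO-8859-1 bytes, returns an upper-cased
# string with combining accents removed.
sub strip_diacritics {
    my ($latin1_bytes) = @_;
    my $chars = decode('iso-8859-1', $latin1_bytes);  # bytes -> characters
    my $decomposed = NFD($chars);                     # i-acute -> "i" + U+0301
    $decomposed =~ s/\p{Mn}//g;                       # drop combining marks
    return uc $decomposed;
}

# \xED is i-acute and \xF3 is o-acute in ISO-8859-1.
print "FIXED: ", strip_diacritics("MAT\xEDAS L\xF3PEZ"), "\n";
```

Letters with no canonical decomposition (ß, æ, ø and friends) pass through the accent-stripping step unchanged, so they would still need an explicit lookup table.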
Re: ISO-8859-1 characters to ASCII
by Tomte (Priest) on Feb 25, 2004 at 16:46 UTC

    Seems to be a stranger problem than you think, as tr works with German umlauts and other ISO-8859-1 characters without a problem:

    tom@margo tom $ perl -e 'my $test="äöüßæ"; $test =~ tr/öäüßæ/oausä/; print $test, "\n"'
    aousä
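
A possible reason the one-liner works while the original script does not: in the one-liner, the string and the tr// list are typed in the same encoding, whereas a script's source file and the data from a DB may not agree. A minimal sketch that sidesteps the source-file encoding by writing the tr// list as hex escapes, assuming the data really is ISO-8859-1 bytes (the sub name ascii_upper_io is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: map the ISO-8859-1 bytes for i-acute/I-acute and
# o-acute/O-acute to plain "I" and "O". Hex escapes make the match
# independent of how this file is saved.
sub ascii_upper_io {
    my ($name) = @_;
    (my $fixed = $name) =~ tr/\xED\xCD\xF3\xD3/IIOO/;
    return $fixed;
}

print "FIXED: ", ascii_upper_io("MAT\xEDAS L\xF3PEZ"), "\n";
```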

    regards,
    tomte


    Hlade's Law:

    If you have a difficult task, give it to a lazy person --
    they will find an easier way to do it.

Re: ISO-8859-1 characters to ASCII
by ambrus (Abbot) on Feb 25, 2004 at 17:00 UTC

    Try this (untested).

    # Each high ISO-8859-1 byte becomes a two-character ASCII mnemonic; operates on $_.
    s@\xA0@NS@g; s@\xA1@!I@g; s@\xA2@Ct@g; s@\xA2@!C@g;
    s@\xA3@L-@g; s@\xA3@Pd@g; s@\xA4@Cu@g; s@\xA4@Xo@g;
    s@\xA5@Y-@g; s@\xA5@Ye@g; s@\xA6@BB@g; s@\xA6@!B@g;
    s@\xA7@SE@g; s@\xA8@':@g; s@\xA9@Co@g; s@\xAA@-a@g;
    s@\xAB@<<@g; s@\xAC@NO@g; s@\xAC@7!@g; s@\xAD@--@g;
    s@\xAE@Rg@g; s@\xAF@'m@g;
    s@\xB0@DG@g; s@\xB1@+-@g; s@\xB2@2S@g; s@\xB3@3S@g;
    s@\xB4@''@g; s@\xB5@My@g; s@\xB6@PI@g; s@\xB6@9I@g;
    s@\xB7@.M@g; s@\xB8@',@g; s@\xB9@1S@g; s@\xBA@-o@g;
    s@\xBB@>>@g; s@\xBC@14@g; s@\xBD@12@g; s@\xBE@34@g;
    s@\xBF@?I@g;
    s@\xC0@A!@g; s@\xC1@A'@g; s@\xC2@A>@g; s@\xC3@A?@g;
    s@\xC4@A:@g; s@\xC5@AA@g; s@\xC6@AE@g; s@\xC7@C,@g;
    s@\xC8@E!@g; s@\xC9@E'@g; s@\xCA@E>@g; s@\xCB@E:@g;
    s@\xCC@I!@g; s@\xCD@I'@g; s@\xCE@I>@g; s@\xCF@I:@g;
    s@\xD0@D-@g; s@\xD1@N?@g; s@\xD2@O!@g; s@\xD3@O'@g;
    s@\xD4@O>@g; s@\xD5@O?@g; s@\xD6@O:@g; s@\xD7@*X@g;
    s@\xD8@O/@g; s@\xD9@U!@g; s@\xDA@U'@g; s@\xDB@U>@g;
    s@\xDC@U:@g; s@\xDD@Y'@g; s@\xDE@TH@g; s@\xDF@ss@g;
    s@\xE0@a!@g; s@\xE1@a'@g; s@\xE2@a>@g; s@\xE3@a?@g;
    s@\xE4@a:@g; s@\xE5@aa@g; s@\xE6@ae@g; s@\xE7@c,@g;
    s@\xE8@e!@g; s@\xE9@e'@g; s@\xEA@e>@g; s@\xEB@e:@g;
    s@\xEC@i!@g; s@\xED@i'@g; s@\xEE@i>@g; s@\xEF@i:@g;
    s@\xF0@d-@g; s@\xF1@n?@g; s@\xF2@o!@g; s@\xF3@o'@g;
    s@\xF4@o>@g; s@\xF5@o?@g; s@\xF6@o:@g; s@\xF7@-:@g;
    s@\xF8@o/@g; s@\xF9@u!@g; s@\xFA@u'@g; s@\xFB@u>@g;
    s@\xFC@u:@g; s@\xFD@y'@g; s@\xFE@th@g; s@\xFF@y:@g;
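
A table-driven variant of the same idea, as a rough sketch only. In a chain of s///g statements, a second substitution for the same byte (\xA2, \xA3 and a few others appear twice) can never fire, because the first has already replaced every occurrence; a hash makes the one-replacement-per-byte rule explicit and does the work in a single pass. The hash below is a small excerpt using the same mnemonics, and the names %mnemonic and to_mnemonics are made up for illustration.

```perl
use strict;
use warnings;

# Excerpt of the byte-to-mnemonic table; extend with the remaining entries.
my %mnemonic = (
    "\xC1" => "A'", "\xCD" => "I'", "\xD3" => "O'",
    "\xE1" => "a'", "\xED" => "i'", "\xF3" => "o'",
    "\xDF" => "ss",
);

# Hypothetical helper: replace every high byte that has a table entry.
sub to_mnemonics {
    my ($text) = @_;
    # Bytes without a table entry pass through unchanged.
    $text =~ s/([\xA0-\xFF])/exists $mnemonic{$1} ? $mnemonic{$1} : $1/ge;
    return $text;
}

print to_mnemonics("MAT\xEDAS L\xF3PEZ"), "\n";   # MATi'AS Lo'PEZ
```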