PerlMonks  

Re: Playing with "funny" chars

by mischief (Hermit)
on Sep 27, 2004 at 12:57 UTC


in reply to Playing with extended chars

You might also want to look at Text::Iconv.

Replies are listed 'Best First'.
Re^2: Playing with "funny" chars
by itub (Priest) on Sep 27, 2004 at 13:12 UTC
    (oops, I wanted to reply to the first post but clicked here by accident ;) ).

    My recommendation is to use perl 5.8.0 or more recent and look at perldoc Encode, perldoc open, and perldoc -f open. If tr doesn't work because you have the characters encoded in two bytes, you can do

    use Encode;    # exports decode_utf8 by default
    $s = decode_utf8($s);

    That will convert the string into the internal representation where characters are characters and you don't have to worry about how many bytes they need for encoding.
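To make the byte-versus-character distinction concrete, here is a minimal sketch. The variable names are illustrative only, and the accented letters are written as hex escapes so the result does not depend on how the script file itself is saved.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Raw UTF-8 octets for the five accented vowels (two bytes each).
my $bytes = "\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA";

# Decode the octets into Perl's internal character representation.
my $chars = decode_utf8($bytes);

# Before decoding, length counts bytes; afterwards it counts characters.
printf "bytes: %d, characters: %d\n", length($bytes), length($chars);

# With a decoded string, tr maps each accented letter as a single character.
# The search list uses code points U+00E1, U+00E9, U+00ED, U+00F3, U+00FA.
(my $plain = $chars) =~ tr/\x{E1}\x{E9}\x{ED}\x{F3}\x{FA}/aeiou/;
print "$plain\n";
```

If the string is printed while it still contains non-ASCII characters, an output layer such as binmode(STDOUT, ':encoding(UTF-8)') is usually wanted as well; see perldoc open.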

      I think the problem is not in the string (I'm using perl 5.8.5, because 5.8.0 had some bugs on RedHat), but in the tr operator itself.

      My first attempt behaves like this:

      perl -e '$_="áéíóú";tr/áéíóú/aeiou/;print'
      aeaoauauau

      It seems that "á" is treated as two characters, maybe "´" and "a", and each one gets a different replacement character ("a" and "e").

      BTW, the encode and decode functions return values that make me think the string is well formed, and that it is tr// that is doing the wrong thing. Am I completely lost?

        If you have utf8 encoded strings in your program file, you need to use the utf8 pragma (see perldoc utf8).

        use utf8;
        $s = 'holáéíóúon';
        $s =~ tr/áéíóú/aeiou/;
        print $s;    # prints holaeiouon
        The code above may show the double characters explicitly since perlmonks.org is served as ISO-8859-1.
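For comparison, the following sketch reproduces what seems to happen without the utf8 pragma, assuming the source is saved as UTF-8. The octets are spelled out as hex escapes (an assumption about how the literal is stored), so tr sees ten separate bytes in both the string and the search list.

```perl
use strict;
use warnings;

# The same five accented vowels as raw UTF-8 octets; no "use utf8" here.
my $bytes = "\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA";

# The search list holds ten bytes and the replacement list only five.
# The first occurrence of a repeated byte (0xC3) wins, so 0xC3 maps to "a",
# and the last replacement character "u" is reused for the overflow bytes.
(my $out = $bytes) =~ tr/\xC3\xA1\xC3\xA9\xC3\xAD\xC3\xB3\xC3\xBA/aeiou/;

print "$out\n";    # prints aeaoauauau
```

Each accented letter comes out as two ASCII letters because tr translated its lead byte and its continuation byte separately, which matches the output reported earlier in the thread.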
