in reply to Re^2: Help with Accented Characters
in thread Help with Accented Characters
To get a proper hexdump, use unpack instead of ord:
print join " ", unpack("(H2)*", $s);
With that change to your code (in addition to adding use utf8;, as ysth already correctly pointed out), I get the output I'd expect, presuming the source file was saved with a UTF-8 editor:
$ ./669879.pl
The string is: 'Resume'
52 65 73 75 6d 65
Uppercase: 'RESUME'
Lowercase: 'resume'
length = 6
bytes = 6
The string is: 'Résumé'
52 c3 a9 73 75 6d c3 a9
Uppercase: 'RÉSUMÉ'
Lowercase: 'résumé'
length = 6
bytes = 8
(I've converted the é/É characters in the output above to ISO-8859-1 so that the PM web frontend displays them properly. As the hexdump shows, though, they're internally encoded as c3 a9, i.e. UTF-8.)
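If you'd rather not depend on how unpack treats the string's internal representation (as far as I know, from Perl 5.10 on unpack works character by character by default, so the same one-liner may show e9 instead of c3 a9), you can encode explicitly before dumping. A minimal sketch, assuming the core Encode module; hexdump_utf8 is just a helper name made up for this example, and the é is written as \x{e9} so the snippet doesn't depend on the encoding of the source file:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not from the original script): returns a
# space-separated hex dump of the UTF-8 encoding of a character string.
sub hexdump_utf8 {
    my ($s) = @_;
    my $octets = encode('UTF-8', $s);            # characters -> bytes
    return join ' ', unpack('(H2)*', $octets);   # two hex digits per byte
}

# Encode on output so the accented characters print as UTF-8.
binmode STDOUT, ':encoding(UTF-8)';

my @words = ('Resume', "R\x{e9}sum\x{e9}");
for my $s (@words) {
    printf "%s: %s (length = %d, bytes = %d)\n",
        $s, hexdump_utf8($s),
        length($s),                      # number of characters
        length(encode('UTF-8', $s));     # number of UTF-8 bytes
}
```

For the two strings above this reports 6 characters / 6 bytes and 6 characters / 8 bytes respectively, with c3 a9 in the second dump, which matches the output shown earlier.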