in reply to clarification on binmode STDOUT
When you add the output layer, you tell print to convert from characters to bytes. Without use utf8;, the Ç in your source is not decoded: Perl reads it as a series of bytes and treats each byte as a Latin-1 character. Its UTF-8 bytes are 0xc3 0x87, and they are interpreted as U+00C3 LATIN CAPITAL LETTER A WITH TILDE and U+0087 <control>, so what you'll see for the first character is Ã.
This can be solved by also adding the line use utf8; to your program, which tells Perl that the source file, including its string literals, should be decoded as UTF-8.
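Here is a minimal sketch of the two cases. To keep it independent of how the file itself is saved, it writes the bytes of Ç with escapes and calls Encode directly instead of relying on use utf8; and an output layer; the variable names and the hex_bytes helper are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Without "use utf8", a literal Ç in a UTF-8 source file reaches Perl as
# two bytes, each treated as one Latin-1 character.
my $without_utf8 = "\xc3\x87";

# With "use utf8", the literal is decoded, giving the single character U+00C7.
my $with_utf8 = decode('UTF-8', $without_utf8);

# Roughly what an :encoding(UTF-8) layer on STDOUT would emit for each string.
my $out_without = encode('UTF-8', $without_utf8);
my $out_with    = encode('UTF-8', $with_utf8);

# Illustrative helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $bytes;
}

printf "without use utf8: %d chars, output bytes %s\n",
    length($without_utf8), hex_bytes($out_without);   # 2 chars, c3 83 c2 87
printf "with use utf8:    %d chars, output bytes %s\n",
    length($with_utf8), hex_bytes($out_with);         # 1 char,  c3 87
```

The four bytes in the first case are the UTF-8 encodings of U+00C3 and U+0087, which is the double encoding described above. In the second case the output is the original two bytes, so the terminal shows Ç.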
I tried to describe Perl's Unicode model in this article; I hope it will help you understand what's going on.
Re^2: clarification on binmode STDOUT
by rmflow (Beadle) on Jul 01, 2009 at 14:53 UTC