One of the nasty interactions you can have with so many components (editor, programming language, terminal) working together to interpret and re-interpret octet sequences :) As quester has already explained how to fix it, this is why:

In the first program Perl doesn't interpret much about the embedded byte string. It sees it's 10 bytes long and writes the raw bytes to STDOUT were the terminal picks them up, recognizes they're valid UTF-8 and displays them thus. use utf8 on the other hand tells Perl to interpret the 10 bytes as UTF-8 characters so it realizes it's actually a string of only 5 characters. But when printing them, it forces them back to Latin-1 unless you use the binmode() which the terminal, expecting UTF-8, cannot parse correctly.

Edit: you were unlucky in that é is a valid character in Latin-1 so it can be "downgraded" to 1-byte encoding without complaints. Had you had some character outside that range in your string (\x{123} works fine) you'd have gotten a "Wide character in print" warning.


In reply to Re: minimalist perl-utf8 question by mbethke
in thread minimalist perl-utf8 question by didess

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.