It *looks* like if I use chr(???) with a ??? <= 255 I will always get the single byte I am looking for (i.e. not translated to/from some type of unicode symbol set). Correct ?
Well... in a way... yes. But you're overlooking one thing: if Perl concatenates a UTF-8 string with a Latin-1 string (at least, that's the only way to think about it that makes sense), Perl will convert the Latin-1 string to UTF-8 first. Let me show you with an example:
($\, $,) = ("\n", " ");            # set up output mode
$string = "A" . chr(180) . "B";    # Latin-1
print unpack "C*", $string;
$string .= chr(367);               # UTF-8
print unpack "C*", $string;
Output:
65 180 66
65 194 180 66 197 175
As you can see, the original chr(180), between chr(65) ("A") and chr(66) ("B"), is converted to UTF-8, resulting in two bytes (194 180).
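If you want to try this on a newer Perl, here is a rough sketch of the same experiment. As far as I know, from 5.10 on unpack "C*" may report character values rather than the internal bytes, so this version looks at the internal representation through the core utf8:: functions instead. The helper name internal_bytes is made up for this example, and it assumes perl 5.8 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: list the bytes Perl stores internally for $str.
sub internal_bytes {
    my ($str) = @_;
    my $copy = $str;              # the copy keeps the same internal format
    if (utf8::is_utf8($copy)) {
        utf8::encode($copy);      # expose the UTF-8 octets as plain bytes
    }
    return unpack "C*", $copy;
}

my $string = "A" . chr(180) . "B";                 # stored as Latin-1
print join(" ", internal_bytes($string)), "\n";    # 65 180 66

$string .= chr(367);                               # forces an upgrade to UTF-8
print join(" ", internal_bytes($string)), "\n";    # 65 194 180 66 197 175

print length($string), "\n";                       # 4 (characters, not bytes)
```

Note that length() still counts four characters: only the storage changed, not the characters themselves.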

So, if you want UTF-8, all you have to do is insert the characters into a UTF-8 string, or concatenate them with a UTF-8 string. That may even be a zero-length string, as returned by pack "U0":

($\, $,) = ("\n", " ");            # set up output mode
$string = "A" . chr(180) . "B";    # Latin-1
print unpack "C*", $string;
$string .= pack "U0";              # zero length, UTF-8
print unpack "C*", $string;
Result:
65 180 66
65 194 180 66
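For comparison, later Perls (5.8 and up) have utf8::upgrade, which asks for the same conversion explicitly instead of going through a concatenation. This is only a sketch of that alternative, not what the code above does; the helper name upgrade_report is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: upgrade a copy of $str and report the UTF-8 flag
# before and after, the resulting UTF-8 octets, and the character count.
sub upgrade_report {
    my ($str) = @_;
    my $before = utf8::is_utf8($str) ? 1 : 0;
    utf8::upgrade($str);              # same characters, UTF-8 internally
    my $after  = utf8::is_utf8($str) ? 1 : 0;
    my $octets = $str;
    utf8::encode($octets);            # character string -> UTF-8 octets
    return ($before, $after, join(" ", unpack "C*", $octets), length $str);
}

my ($before, $after, $bytes, $len) = upgrade_report("A" . chr(180) . "B");
print "$before $after\n";   # 0 1
print "$bytes\n";           # 65 194 180 66
print "$len\n";             # 3
```

The string still holds three characters after the upgrade; chr(180) just takes two bytes in storage.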

P.S. This was tested with perl 5.6.1 on Windows. The platform shouldn't matter much, but you do need at least perl 5.6.


In reply to Re: Re: Re: Why is variable interpolation suppressed in \x{$xxx} replacement ? by bart
in thread Why is variable interpolation suppressed in \x{$xxx} replacement ? by Anonymous Monk
