Well, the Dump outputs show that the function is correctly returning the Unicode character 0xa5; it's just that the internal encoding happens not to be utf8. Using utf8::upgrade gets round whatever problem you're having because it converts the internal representation to utf8.

The problem must lie in how you're using the returned value. If, for example, you're just printing it to STDOUT, and whatever's listening on STDOUT (e.g. the terminal) expects utf8 encoding, then you need to let Perl know that any output on that file handle should be utf8 encoded, e.g.

    $ perl -e 'print chr 0xa5'|od -x
    0000000 00a5
    $ perl -e 'binmode(STDOUT, ":utf8"); print chr 0xa5'|od -x
    0000000 a5c2
    $

See perluniintro (in 5.8.x) for more information.
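If you'd rather check this from inside a script than through od, something like the following rough sketch should show the same thing. The helper name bytes_written is made up for illustration, and it writes to an in-memory file handle instead of STDOUT so the bytes can be inspected. Note that od -x prints 16-bit words, so on a little-endian machine the bytes c2 a5 show up as a5c2 in the output above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): print $text to an
# in-memory handle with the given layer applied via binmode, and return
# the bytes that were written, as a hex string.
sub bytes_written {
    my ($layer, $text) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";
    binmode($fh, $layer) or die "binmode failed: $!";
    print {$fh} $text;
    close $fh or die "close failed: $!";
    return unpack 'H*', $buffer;
}

my $char = chr 0xa5;          # U+00A5, internally not utf8
my $upgraded = $char;
utf8::upgrade($upgraded);     # same character, utf8 internal representation

# The internal representation does not change what the string *is*.
print "same string:   ", ($char eq $upgraded ? "yes" : "no"), "\n";

# What matters for output is the layer on the file handle.
print "raw:           ", bytes_written(':raw',  $char),     "\n";  # a5
print "utf8:          ", bytes_written(':utf8', $char),     "\n";  # c2a5
print "raw, upgraded: ", bytes_written(':raw',  $upgraded), "\n";  # a5
```

The point of the last line is that upgrading the string by itself doesn't change the bytes sent to a handle without a utf8 layer; it's the binmode on the handle that decides the output encoding.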

Dave.


In reply to Re^3: UTF8/Unicode Confusion by dave_the_m
in thread UTF8/Unicode Confusion by jk2addict
