Damn right I disagree. It is not Perl's problem that someone is using a function documented to check for accidental double-encoding to check whether something is valid UTF-8. That's akin to using uc to get the first character of a string. There's nothing Perl can do to stop you from using a function completely unrelated to the one you want.

This is the second time in this thread you've implied that I maintain that Perl's handling of UTF-8 isn't confusing. That's a lie. The former bugs in Perl (some still present) and the plethora of buggy XS modules (because XS is hard!) have led people like you to disseminate misinformation, which has created a self-feeding vicious loop of confused people. I've repeatedly said that Perl should be able to differentiate encoded strings from decoded strings and prevent you from mixing them.

Speaking of misinformation, improper upgrading doesn't cause double-encoding. Quite the opposite: it causes a string encoded using UTF-8 to become decoded. (Upgrading a string that isn't encoded using UTF-8 creates a corrupt scalar, as seen using perl -MDevel::Peek -MEncode=_utf8_on -we"$_ = qq{\x80}; _utf8_on($_); Dump($_)")
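
To make the "becomes decoded" part concrete, here's a minimal sketch. The sample character U+00E9 and the variable names are mine, chosen purely for illustration, and _utf8_on is an internal Encode function used here only to reproduce the mistake, not something to use in real code.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 _utf8_on);

my $chars = "\x{E9}";              # one character, U+00E9 (illustrative sample)
my $bytes = encode_utf8($chars);   # properly encoded: two bytes, C3 A9

my $before = length($bytes);       # counts bytes at this point

# Flip the internal UTF8 flag without touching the buffer.
# This is the "improper upgrade": Perl now reads C3 A9 as one character.
_utf8_on($bytes);

my $after = length($bytes);        # counts characters now

print "length before: $before, after: $after\n";
print "string now compares equal to the decoded text\n" if $bytes eq $chars;
```

The buffer never changed, only how Perl interprets it, so the encoded string ends up as the decoded one rather than as a doubly encoded one.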

Quickly, tell me, what that actually means?

Double encoding is doing encode_utf8(encode_utf8($x)) when you mean to do encode_utf8($x).
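
A minimal sketch of that, again with U+00E9 as an arbitrary sample value and variable names that are only for illustration:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

my $x = "\x{E9}";                   # one character, U+00E9 (illustrative sample)

my $once  = encode_utf8($x);        # what you meant to do
my $twice = encode_utf8($once);     # double encoding: the bytes are encoded again

# Hex dumps of the two byte strings
my $once_hex  = unpack('H*', $once);
my $twice_hex = unpack('H*', $twice);
print "once:  $once_hex\n";
print "twice: $twice_hex\n";

# A single decode strips only one layer, leaving the singly encoded bytes.
my $one_layer_off = decode_utf8($twice);
print "one decode gives back the singly encoded string\n"
    if $one_layer_off eq $once;
```

The second call treats each byte of the already encoded string as a character in its own right and encodes it again, which is why the output is longer and why one decode isn't enough to get back to the text.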


In reply to Re^4: JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for character in range 127 to 255 by ikegami
in thread JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for character in range 127 to 255 by Ovid
