It doesn't say "Wide character".

Specific error message aside, Perl should never treat a number as a 'wide character' without explicit notification from the programmer that that is his intent.

c:\test>perl -we"print chr( 257 )" | wc -c
Wide character in print at -e line 1.
2
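For contrast, here is a minimal sketch of what explicit notification could look like, assuming the intended mapping is UTF-8 (that choice is an assumption made for illustration; nothing in the thread fixes it). The number only becomes bytes once the programmer names an encoding, via Encode's encode() or an output layer set with binmode.

```perl
use strict;
use warnings;
use Encode qw(encode);

# chr(257) yields one Perl character; no byte representation is implied yet.
my $char = chr(257);

# The programmer names the mapping explicitly (UTF-8 is assumed here).
my $bytes = encode('UTF-8', $char);

printf "characters: %d, bytes: %d\n", length($char), length($bytes);

# Declaring an output layer states the intent up front,
# so print has no reason to emit "Wide character".
binmode STDOUT, ':encoding(UTF-8)';
print $char, "\n";
```

With the layer declared, the two bytes counted by wc -c are the result of a stated choice rather than a silent guess.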
I've already pointed out the documentation is wrong.

No! You didn't. Nowhere prior to this post anywhere in this thread.

There is no such thing as Unicode number 0x20000, yet

So, the documentation is wrong! And the implementation is (silently) wrong!

That pretty much covers everything. Unicode support in perl is broken.

In Perl, a character is a number in 0 to UVMAX.

And that bullshit is exactly why it is so broken.

Because &^*&% like you will keep on conflating 'numbers' with 'characters'.

  1. UVMAX is cpu dependent.

    Typically 4294967295 or 18446744073709551615, but with other values possible.

  2. The term 'character' has no meaning outside of some mapping.

    Unless a number can be mapped to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language, it is just a number.

    And even when it can be so mapped, until it is mapped, it is still just a number.

    And any suggestion otherwise is just so much bullshit.

  3. And 4294967295, much less 18446744073709551615, cannot be mapped to 'a character' in any known or proposed mapping.

    Which makes this:

    In Perl [or any language], a character is a number in 0 to UVMAX.
    stand out as the total twaddle it is.
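The three points above can be checked on any given build with a short sketch. It assumes only the core Config module and the fact that Unicode's code space ends at U+10FFFF; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Config;

# ~0 is the largest unsigned integer (UV) this particular perl can hold.
my $uv_max = ~0;
printf "UV size: %d bytes, max UV: %s\n", $Config{uvsize}, $uv_max;

# Unicode's code space ends at U+10FFFF, yet chr() still hands back
# a 'character' for the next number up.
my $beyond = 0x110000;
my $c      = chr($beyond);
printf "chr(0x%X) round-trips through ord(): %s\n",
    $beyond, ( ord($c) == $beyond ? "yes" : "no" );
```

The first line of output differs between 32-bit and 64-bit builds, which is the cpu dependence from point 1. The second shows a number with no mapping in Unicode being carried around as if it were a character.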

Unicode support in Perl is broken. And until people like you stop pretending that it isn't, it will stay that way.

Indeed, the longer people keep trying to pretend that you can transparently handle the abortion that is Unicode, whether retro-fitting an existing language or implementing a new one, the longer it will be before we can evolve some sane semantics for handling MBCSs.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^5: Simplest Possible Way To Disable Unicode by BrowserUk
in thread Simplest Possible Way To Disable Unicode by JapanIsShinto
