in reply to Re^9: Mugged by UTF8, this CANNOT be right
in thread Mugged by UTF8, this CANNOT be right

Except that he's having problems with Unicode encodings.

He didn't specify which encodings are involved, and it really doesn't matter. He's having problems with encodings, and he'd have the same problems no matter which encoding was involved.

as a shorthand for one or more of the encodings embodied in the Unicode standard(s)

There are no encodings embodied in the Unicode standard. At least that's what I was told, and I don't see any myself. Feel free to point out where.

So again, Perl is great at handling Unicode. It's probably the best at it. Perl's support is so good that the Consortium is asking Perl's developers for advice on defining behaviour that it never defined well, because no one else had attempted to implement it yet.

Encodings, that's another issue. That's what was being discussed.
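
To make the distinction concrete, here is a minimal sketch, written in Python purely for illustration. The sample character and the variable names are assumptions chosen for the example; nothing here comes from the original poster's code. It shows one character (a single code point) serialised to bytes three different ways, and what happens when those bytes are decoded with the wrong encoding.

```python
# One character, i.e. one Unicode code point (U+00E9).
# The choice of character is arbitrary and only for illustration.
ch = "\u00e9"

# The same character serialised to bytes under three encodings.
utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")

# The string has length 1 regardless of how many bytes each encoding needs.
print(len(ch), utf8.hex(), utf16.hex(), utf32.hex())

# Decoding the UTF-8 bytes as if they were Latin-1 yields two wrong
# characters instead of one right one: a typical encoding mix-up.
mojibake = utf8.decode("latin-1")
print(repr(mojibake))
```

The character never changes; only the bytes do. Problems of the kind discussed in this thread usually arise at the decode or encode boundary, not in the handling of the characters themselves.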

Re^11: Mugged by UTF8, this CANNOT be right
by NodeReader (Initiate) on Jan 27, 2011 at 21:38 UTC
    There are no encodings embodied in the Unicode standard. At least that's what I was told, and I don't see any myself. Feel free to point out where.
    See section 3.9 (Unicode Encoding Forms) in this PDF. This is a draft of Unicode 6.0, but this section has been in the Unicode standard for some time.
      Thanks, I stand corrected.
      If you're such an expert, why not just point out that UTF means Unicode Transformation Format and have that as the coup de grâce? Hmmm???
        I wouldn't say I'm an expert at anything other than Reading Nodes. :-) As far as Unicode, UTF-whatever and other text encoding schemes are concerned, I'm probably more confused than anyone. Over the years, I've read a lot of standards, though (professional hazard), and I at least think I know how to interpret what I've read.

        I know, DFTT.

        To paraphrase one of my favorite monks, I Go Back to Reading, Now.