in reply to Re^4: Best Way to Get Length of UTF-8 String in Bytes?
in thread Best Way to Get Length of UTF-8 String in Bytes?

I don’t know what all that Microsoft noise was for

My terminal uses cp437, and the garbage of encoding UTF-8 was there in the OP's output too. It just looks a bit different on my terminal ('中国 vs \x{00c3}\x{0089}).

nor the use utf8 either for that matter

Are you suggesting I should have made irrelevant changes to the OP's code?

And we are also aware of how unlikely it is to be a problem for Jim given the data samples he displayed.

What do you mean unlikely? I'd say it's impossible since those characters are above U+00FF.
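
A minimal sketch of the distinction, assuming the problem in question only affects strings whose characters all fit in U+0000..U+00FF. The helper name utf8_byte_length is made up for illustration; encode_utf8 is from the core Encode module. The strings are written with escapes so the result does not depend on the source file's encoding.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: number of bytes the string occupies once
# it is encoded as UTF-8, regardless of how Perl stores it internally.
sub utf8_byte_length {
    my ($str) = @_;
    return length(encode_utf8($str));
}

my $wide   = "\x{4E2D}\x{56FD}";   # the OP's two characters, both above U+00FF
my $narrow = "\x{E9}";             # U+00E9, at or below U+00FF

printf "%d chars, %d bytes\n", length($wide),   utf8_byte_length($wide);
printf "%d chars, %d bytes\n", length($narrow), utf8_byte_length($narrow);
```

This prints "2 chars, 6 bytes" and then "1 chars, 2 bytes". Encoding explicitly gives the right count for both strings, so it does not matter which range the characters fall in.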

But so what. He's not going to deal with only those two characters.

I don't get it. In one breath, you say he should handle NFD. In the next, you say I should only concern myself with the characters he posted.
