You said "And I cringe about calling a byte a character." What does that even mean? Did someone say a byte is a character? Are you talking about something I said? If so, what?
Your explanations, including the one to which you just linked, do not provide clarity.
| [reply] |
Oh, you have a problem with the fact that you can store a byte in a character.
A character can be:
- Smallest addressable unit. Literally a synonym for byte.
- Element of the string.
- Grapheme
- Glyph
- Code point
In Perl, it has the second definition: a character is an element of a string. There is no other word for this.
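To make the "element of the string" definition concrete, here is a minimal sketch. The variable names are mine, and I'm assuming the core Encode module; it just shows that length counts string elements, whether those elements hold bytes or decoded code points.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 encoding of U+00C5 is two bytes, so this string has two elements.
my $bytes = "\xC3\x85";

# Decoding produces a string with a single element whose value is 0xC5.
my $text = decode('UTF-8', $bytes);

# Helper (hypothetical name): list the element values of a string in hex.
sub elements {
    my ($s) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $s;
}

printf "bytes: length=%d elements=%s\n", length($bytes), elements($bytes);
printf "text:  length=%d elements=%s\n", length($text),  elements($text);
```

Both are just strings of characters in Perl's sense. Whether an element stands for a byte or a code point is something the program has to know; the string itself doesn't say.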
You apparently associate character with one of the last three. I don't know which.
For example, take a look at Å [U+212B], Å [U+C5] and Å [U+41,U+30A].
- They are all the same glyph, but the first one has a different grapheme.
- The last two are the same grapheme, but all use different code points.
So when you say character, do you think that all three of those things are the same? Only two? None of them? I have no idea. Unicode suggests most people would consider that list to have two characters: the Angstrom Sign, and Latin Capital Letter A with Ring Above. But most people isn't everyone. And that's why you should use a more precise term than character if you mean grapheme, glyph or code point. Standards exist for a reason.
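If it helps, the three spellings can be compared in Perl. This is only a sketch: the hash keys are labels I made up, and I'm assuming the core Unicode::Normalize module. It prints the element count, the grapheme cluster count (via \X) and the code points of each form, then checks whether canonical normalization maps them to the same string.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# The three spellings from the post, written as escapes so the source stays ASCII.
my %forms = (
    angstrom_sign => "\x{212B}",       # ANGSTROM SIGN
    precomposed   => "\x{C5}",         # LATIN CAPITAL LETTER A WITH RING ABOVE
    decomposed    => "\x{41}\x{30A}",  # A followed by COMBINING RING ABOVE
);

for my $name (sort keys %forms) {
    my $s         = $forms{$name};
    my @graphemes = $s =~ /\X/g;       # extended grapheme clusters
    my $points    = join ' ', map { sprintf 'U+%04X', ord($_) } split //, $s;
    printf "%-14s length=%d graphemes=%d code points=%s\n",
        $name, length($s), scalar(@graphemes), $points;
}

# Canonical composition (NFC) of each form, to compare them as text.
my %nfc = map { $_ => NFC($forms{$_}) } keys %forms;
printf "all equal after NFC: %s\n",
    ($nfc{angstrom_sign} eq $nfc{precomposed}
        && $nfc{precomposed} eq $nfc{decomposed}) ? 'yes' : 'no';
```

Note that length gives 1, 1 and 2 here, because it counts string elements and nothing else. Whether the normalized comparison is the right notion of "same character" for your purposes is exactly the question above.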
| [reply] |