PerlMonks
Re^12: Seeking Perl docs about how UTF8 flag propagates (Terminology)
by ikegami (Patriarch) on May 18, 2023 at 16:42 UTC ( [id://11152277]=note )
Really?

Character's definition is wrong. Perl has 32- or 64-bit chars, not 32-bit chars. And these days, it's usually 64-bit chars. Theoretically, the encoding supports 72-bit chars, but they gotta fit in a UV for Perl to be able to work with them, so the size of a UV controls the range of a char.

It's also not that clear. The important part is that a character is an element of a string. For example, substr( $_, $i, 1 ) returns the character at offset $i. I would also mention the range, but only after saying it's the element of a string. We could also mention chr and ord as means of switching between representations of that character.

But, while I disagree with the wording, I do agree with the term and its meaning.

Then there's byte and octet, and I can't tell how they are different from each other from the Terminology section. And anyone who uses these two words to mean two different things needs to find better terms.

And that's it? Where's the rest? The phrases we actually need?

If we continue on to the next section, it starts by saying that encode "Encodes the scalar value STRING from Perl's internal form into ENCODING and returns a sequence of octets." wtf does the internal form have to do with anything? The input is expected to be a string of Unicode Code Points. The internal form of those UCP is not relevant.

Four sentences, and five problems. And that's not counting all the missing terminology. This is not what I was expecting when you gave it a gold star.
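
To make the two technical points above concrete, here is a minimal sketch. It is not taken from the documentation under discussion; the sample strings and variable names are made up for illustration. It assumes a stock Perl with the core Encode module, and shows (1) a character as an element of a string, with ord and chr converting between the character and its number, and (2) encode producing the same octets whether the string is stored downgraded or upgraded internally.

```perl
use strict;
use warnings;
use Encode qw( encode );

# A character is an element of a string: substr returns one element.
my $s  = "a\x{E9}\x{2660}";        # illustrative 3-character string
my $ch = substr( $s, 1, 1 );       # the character at offset 1

# ord and chr switch between a character and its numeric value.
my $n    = ord( $ch );             # 0xE9
my $back = chr( $n );              # the same one-character string

# Same string of code points, two different internal storage formats.
my $down = "\x{E9}";
my $up   = "\x{E9}";
utf8::downgrade( $down );          # stored one byte per character
utf8::upgrade( $up );              # stored as UTF-8 internally

my $flags_differ  = utf8::is_utf8( $down ) != utf8::is_utf8( $up );
my $strings_equal = $down eq $up;

# encode only cares about the code points, not how they are stored.
my $octets_down = encode( 'UTF-8', $down );
my $octets_up   = encode( 'UTF-8', $up );

printf "length=%d ord=0x%X round_trip=%s\n",
    length( $s ), $n, ( $back eq $ch ? 'yes' : 'no' );
printf "flags_differ=%s strings_equal=%s octets_equal=%s\n",
    map { $_ ? 'yes' : 'no' }
    $flags_differ, $strings_equal, $octets_down eq $octets_up;
```

The two copies of the string differ only in the UTF8 flag, compare equal with eq, and encode to identical octets, which is why mentioning the internal form in the description of encode adds nothing.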