in reply to Re: Re: Re: Re: How are regex character classes implemented?
in thread How are regex character classes implemented?
Yes, there is a big difference between code points and encodings. A capital 'A' is code point 65 (U+0041). How you store that 65 in your program is beside the point: it could be a 7-bit integer, a 64-bit integer, a floating-point number, a string of EBCDIC digits, Huffman-encoded variable-length fields, or whatever.
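To make that concrete, here is a rough Python sketch that stores the same code point three different ways and recovers 65 each time. The `store` and `load` helpers are made up for illustration, and `cp037` is just one EBCDIC code page picked as an assumption; nothing here is how any particular regex engine does it.

```python
import struct

CODE_POINT = 65  # capital 'A', U+0041


def store(cp):
    """Return several unrelated byte layouts that all hold the same number."""
    return {
        "int64_be": struct.pack(">q", cp),           # 64-bit big-endian integer
        "float64_be": struct.pack(">d", float(cp)),  # IEEE 754 double
        "ebcdic_digits": str(cp).encode("cp037"),    # the digits "65" in EBCDIC
    }


def load(kind, raw):
    """Undo store(): recover the integer code point from its stored form."""
    if kind == "int64_be":
        return struct.unpack(">q", raw)[0]
    if kind == "float64_be":
        return int(struct.unpack(">d", raw)[0])
    if kind == "ebcdic_digits":
        return int(raw.decode("cp037"))
    raise ValueError(f"unknown storage kind: {kind}")


if __name__ == "__main__":
    for kind, raw in store(CODE_POINT).items():
        # The bytes differ wildly, but the character that comes back is the same.
        print(kind, raw.hex(), chr(load(kind, raw)))
```

The bytes on disk or in memory look nothing alike, yet every form round-trips to the same code point, which is the only thing a character class has to care about.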
UTF-8 is great for the reasons you list, and for a few others: it's a strict superset of ASCII, and it's byte-order neutral.
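Both of those properties are easy to check. The sketch below uses only Python's built-in codecs; the helper names are hypothetical and only there to keep the example tidy.

```python
def ascii_matches_utf8():
    """True if every ASCII code point encodes to the same single byte in UTF-8."""
    return all(chr(i).encode("utf-8") == bytes([i]) for i in range(128))


def utf16_forms(ch):
    """Return the little-endian and big-endian UTF-16 bytes for one character."""
    return ch.encode("utf-16-le"), ch.encode("utf-16-be")


if __name__ == "__main__":
    print(ascii_matches_utf8())

    le, be = utf16_forms("é")
    # UTF-16 works in 16-bit units, so the byte order has to be agreed on.
    print(le.hex(), be.hex())

    # UTF-8 is defined as a sequence of bytes, so there is only one form.
    print("é".encode("utf-8").hex())
```

Because UTF-8 is a byte sequence rather than a sequence of multi-byte units, there is no little-endian or big-endian variant to choose between, whereas UTF-16 needs a BOM or an out-of-band agreement.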
—John