Or you could standardize the internal representation. A string is a sequence of code points. Storing the sequence length could be handy when dealing predominantly with string objects. Then the following cases arise:
Extended 8-bit charsets (ISO-8859) suffer under a UTF-8 internal representation, unless you hack the (ncodepts == nbytes) case to indicate native format...
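To make the cost concrete, here is a minimal sketch comparing code point count with UTF-8 byte count. The helper name `counts` is hypothetical, and this only measures ordinary Perl strings via the core Encode module; it is not the proposed internal representation itself.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns (code point count, UTF-8 byte count).
sub counts {
    my ($str) = @_;
    my $ncodepts = length($str);                     # characters (code points)
    my $nbytes   = length(encode('UTF-8', $str));    # bytes once UTF-8 encoded
    return ($ncodepts, $nbytes);
}

my @ascii  = counts("cafe");         # pure ASCII: the two counts match
my @latin1 = counts("caf\x{E9}");    # U+00E9 needs two bytes in UTF-8

printf "ascii:  %d code points, %d bytes\n", @ascii;
printf "latin1: %d code points, %d bytes\n", @latin1;
```

With plain UTF-8 storage the two counts are equal only for pure ASCII text, which is why that combination is free to be reused as a flag for "one byte per code point" native storage.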
More interesting is the interaction between objects. Considering a blob and a string object:

    $foo = ($str . $obj);
    $bar = ($obj . $str);
    $baz = "${obj}${str}";

When is the blob promoted to a string, and when does the opposite happen? Object representation and efficiency are certainly big concerns, but surely the semantic implications of Unicode are far more insidious.
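For comparison, this is a small sketch of what Perl does today with plain scalars rather than distinct blob and string objects: when a byte string is concatenated with a character string, each byte is silently treated as a code point. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $str  = "\x{263A}";                   # character string: one code point
my $blob = encode('UTF-8', "\x{E9}");    # byte string: the two bytes C3 A9

# Implicit promotion: each byte of $blob becomes a code point (U+00C3, U+00A9).
my $implicit = $str . $blob;

# Explicit decoding first: the two bytes become the single code point U+00E9.
my $explicit = $str . decode('UTF-8', $blob);

printf "implicit: %d code points, second is U+%04X\n",
    length($implicit), ord(substr($implicit, 1, 1));
printf "explicit: %d code points, second is U+%04X\n",
    length($explicit), ord(substr($explicit, 1, 1));
```

The implicit result is three code points long and the explicit one is two, so the order and timing of promotion change the meaning of the data, not just its storage.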
In reply to Re^5: JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for character in range 127 to 255
by oiskuu
in thread JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for character in range 127 to 255
by Ovid