in reply to UTF8 Validity
UTF8 ne UTF-8. A string can be valid UTF8 without being valid UTF-8: UTF-8 is stricter and allows exactly one way to encode each codepoint, whereas UTF8 also accepts non-canonical encodings.
That was my first thought when I read the title "UTF8 Validity", which is not "UTF-8 Validity" ;-)
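
To make the "one way per codepoint" point concrete, here is a rough, language-neutral sketch in Python. It is not how Perl decodes anything: `lax_decode_2byte` and `is_strict_utf8` are made-up helpers for illustration, and the lax one simply skips the overlong check that a strict decoder performs. The bytes `C0 AF` are a non-canonical (overlong) two-byte form of U+002F, whose only canonical encoding is the single byte `2F`.

```python
def lax_decode_2byte(data: bytes) -> int:
    """Hypothetical lax decoder: unpacks a 2-byte sequence, no overlong check."""
    b0, b1 = data
    # Lead byte must look like 110xxxxx, continuation byte like 10xxxxxx.
    if (b0 & 0xE0) != 0xC0 or (b1 & 0xC0) != 0x80:
        raise ValueError("not a 2-byte sequence")
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


def is_strict_utf8(data: bytes) -> bool:
    """True if Python's built-in (strict) UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


canonical = b"/"         # U+002F in its one canonical form
overlong = b"\xc0\xaf"   # same codepoint, non-canonical 2-byte form

print(hex(lax_decode_2byte(overlong)))  # lax view: decodes to U+002F
print(is_strict_utf8(canonical))        # strict view: accepted
print(is_strict_utf8(overlong))         # strict view: rejected
```

A lax decoder treats both byte strings as the same character, while a strict one only accepts the first. That gap is why the distinction matters when you validate input.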