Except of course where the embedded null is a legitimate part of a multi-byte character--which means a full unicode verification of every string passed to every system API
Um, sorry, but if you're talking about Perl's internal representation of Unicode, which is UTF-8, the only thing that involves a null byte is the Unicode code point U+0000 (i.e. "NULL") -- and this, BTW, is simply the single null byte itself in UTF-8. For every other UTF-8 character, every byte is non-null: the lead byte of a multi-byte sequence is 0xC2 or higher, and every continuation byte falls in the range 0x80-0xBF. And I don't know of any pre-Unicode encodings that use nulls as parts of multi-byte characters.
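For anyone who wants to check the UTF-8 claim rather than take it on faith, here is a rough sketch in Python (used only for illustration; it exercises Python's strict UTF-8 encoder, not Perl's internals). The function name is made up for this example, and the surrogate range is skipped because strict UTF-8 refuses to encode it.

```python
# Hypothetical helper: brute-force check of which code points produce
# a 0x00 byte when encoded as UTF-8.
SURROGATES = range(0xD800, 0xE000)  # not encodable in strict UTF-8

def code_points_with_null_byte():
    """Return every code point whose UTF-8 encoding contains a 0x00 byte."""
    hits = []
    for cp in range(0x110000):      # U+0000 .. U+10FFFF
        if cp in SURROGATES:
            continue
        if 0 in chr(cp).encode("utf-8"):
            hits.append(cp)
    return hits

if __name__ == "__main__":
    # Only U+0000 should show up in the result.
    print([f"U+{cp:04X}" for cp in code_points_with_null_byte()])
```

The loop covers about 1.1 million code points, so it takes a moment to finish, but it is exhaustive rather than a spot check.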
If a string of octets is supposed to represent UTF-16, then sure, we would expect some of those octets to be null, because each octet is supposed to be treated as half of a 16-bit binary "word". But this is a very different situation: here we are talking about something more akin to plain old raw binary data, not a string of characters that can be cast directly to a char* and treated as a string in C.
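To make the contrast concrete, here is a small sketch (again in Python, purely as an illustration) of what C-style string handling would see in each encoding. The helper `c_string_view` is hypothetical: it only models the "stop at the first null byte" behaviour of functions like strlen and does not call any C code.

```python
def c_string_view(octets: bytes) -> bytes:
    """Model what a C string function sees: everything before the first 0x00."""
    return octets.split(b"\x00", 1)[0]

text = "Perl"
utf8 = text.encode("utf-8")         # one non-null byte per ASCII character
utf16 = text.encode("utf-16-le")    # each ASCII character gains a 0x00 high byte

if __name__ == "__main__":
    print(utf8, c_string_view(utf8))    # the UTF-8 octets survive intact
    print(utf16, c_string_view(utf16))  # the UTF-16 octets are cut off early
```

The UTF-16 octets are fine as data, but anything that treats them as a null-terminated char* stops after the first character.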
Re^4: How is perl able to handle the null byte?
by BrowserUk (Patriarch) on Jun 16, 2006 at 23:33 UTC