in reply to Re^19: Interleaving bytes in a string quickly
in thread Interleaving bytes in a string quickly
Gotcha! (at last!).
Gotcha what? I've said the same thing from the beginning.
After utf8::upgrade($byte_string);, $byte_string is no longer a byte string. It's a character string, or a codepoint string.
No, it's not. Just like a number can be stored in an IV, a UV, an NV or a PV, a string of bytes can be stored in a PV w/ UTF8=0 or a PV w/ UTF8=1. It's the same bytes no matter what internal format is used. Perl does not assign meaning to the values inside the string*.
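A minimal sketch of that claim, assuming a three-byte sample string (the variable names and data are made up for illustration): upgrading changes the internal storage flag, but the string compares equal and holds the same values at the Perl level.

```perl
use strict;
use warnings;

# Hypothetical sample data: three byte values, two of them above 0x7F.
my $bytes = "\x41\xE9\xFF";

# Copy the string and switch the copy's internal storage to UTF8=1.
my $upgraded = $bytes;
utf8::upgrade($upgraded);

# The internal format differs between the two scalars...
my $flag_before = utf8::is_utf8($bytes)    ? 1 : 0;
my $flag_after  = utf8::is_utf8($upgraded) ? 1 : 0;

# ...but at the Perl level they are the same string.
my $same_string = ( $bytes eq $upgraded )                   ? 1 : 0;
my $same_length = ( length($bytes) == length($upgraded) )   ? 1 : 0;
my @values      = map { ord } split //, $upgraded;

print "UTF8 flag: $flag_before -> $flag_after\n";
print "equal: $same_string, same length: $same_length\n";
printf "values: %s\n", join ' ', map { sprintf '%02X', $_ } @values;
```

The difference only becomes visible to code that reads the internal buffer directly without checking the flag, which is what XS code can do.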
If you guarantee that you give a string in UTF8=0 format (say by calling utf8::downgrade before calling interleave) you won't have a problem. That hadn't been specified until now. Instead of trying to gotcha me, maybe you should have said what you want to say. Your games aren't fun.
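A rough sketch of the downgrade guard in pure Perl, assuming the same hypothetical three-byte sample. `bytes::length` is used here only to peek at the size of the internal buffer, which is what code that ignores the UTF8 flag would be working with; it stands in for the XS side and is not the actual interleave code.

```perl
use strict;
use warnings;
require bytes;    # loaded only for bytes::length; byte semantics are not enabled

# Hypothetical sample: simulate a byte string that arrives in UTF8=1 format.
my $data = "\x41\xE9\xFF";
utf8::upgrade($data);

# Size of the internal buffer while upgraded: the two values above 0x7F
# each take two bytes in the internal encoding.
my $buffer_len_upgraded = bytes::length($data);

# Guard: force UTF8=0 storage before handing the string to byte-level code.
# With a true second argument, utf8::downgrade returns false instead of
# dying when the string holds a value above 0xFF.
my $ok = utf8::downgrade( $data, 1 ) ? 1 : 0;

# After a successful downgrade the buffer holds one byte per value.
my $buffer_len_downgraded = bytes::length($data);

print "downgrade ok: $ok\n";
print "buffer length: $buffer_len_upgraded -> $buffer_len_downgraded\n";
print "string length: ", length($data), "\n";
```

If `utf8::downgrade` returns false, the string contains something that is not a byte, and the caller has to decide how to handle that rather than pass it on.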
...can silently encode your bytes using UTF-8.", is just plain fiction. Any encoding has to be done, explicitly, by the programmer.
No, it doesn't. I even gave an example.
if (!sv_in || !sv_pad) croak("usage"); is a redundant code path
Ah good. I wasn't sure, so I erred on the safe side. In many places in the core, SV* can be NULL.
STRLEN i = l_in; is never used.
Thanks, Fixed.
It can also segfault! Try this print interleave_bytes( undef, 0 );
Thanks, Fixed by restoring my initial approach on which I had tested that. (The problem is that SvPV(0) is special.)
So, if you're gonna stand on that high horse throwing stones, you really ought to make sure that your mount doesn't have a glass jaw!
Pointing out a problem and explaining it when asked does not fit that description.
And your suggestion that I should refuse to say anything until I can produce perfect code every time is just ridiculous. I have no problem addressing my problems.
* Unless you ask it to. For example, uc will treat the values as Unicode characters (regardless of the storage format), and vec will treat the values as bit strings (regardless of the storage format).
Update: Additions made to address the second half of the parent. I hadn't read it initially, figuring it was just "gotcha!". But there were useful comments.