Nope. I didn't miss it. I just didn't believe that you could get things so arse backward.
(Sorry about this, but the point needs to be stated clearly!) SvPVX() performs NO COERCIONS WHATSOEVER!
Which makes that impossible. I therefore invite you to prove your assertion with code!
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"SvPVX() performs NO COERCIONS WHATSOEVER"
I know. It just doesn't necessarily return a pointer to the bytes of a byte string.
"I therefore invite you to prove your assertion with code!"
Well, I don't know of anything that omits the \0, so the remaining two can be shown using:
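my $byte_string = "\x80\x81";
dump_sv_pvx($byte_string);      # buffer holds the two bytes 80 81
utf8::upgrade($byte_string);    # same string, now stored internally as UTF-8
dump_sv_pvx($byte_string);      # buffer now holds C2 80 C2 81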
Any function or PerlIO layer is free to switch the internal storage format, even if the string only contains bytes. Switching the format doesn't change the string at all. It's still the same string of bytes.
You didn't specify where the string came from. Maybe it came from lcss, for example, which can switch the internal format. (You discussed using lcss recently, IIRC.) If you need a specific format (and you do), SvPVX without a preceding format check is buggy.
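To make the storage difference concrete without an XS build, here is a minimal sketch in plain C. It is not Perl's implementation: `toy_sv`, `toy_set_bytes`, `toy_upgrade`, `toy_pvx` and `toy_get_bytes` are hypothetical names invented for illustration, and the struct models only a buffer, a length and a format flag. It assumes an ASCII platform and characters in the range 0..255.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a scalar's string slot. Not Perl's real SV. */
typedef struct {
    unsigned char buf[16];
    size_t        len;      /* bytes in buf, excluding the trailing NUL */
    int           is_utf8;  /* 0: one byte per character; 1: UTF-8 */
} toy_sv;

/* Store a byte string in the one-byte-per-character format. */
static void toy_set_bytes(toy_sv *sv, const unsigned char *src, size_t n)
{
    assert(n < sizeof sv->buf);
    memcpy(sv->buf, src, n);
    sv->buf[n] = '\0';
    sv->len = n;
    sv->is_utf8 = 0;
}

/* Mimic what an upgrade does to storage: re-encode each character
   (0..255) as UTF-8. The characters are unchanged; the buffer is not. */
static void toy_upgrade(toy_sv *sv)
{
    unsigned char out[sizeof sv->buf];
    size_t i, o = 0;
    if (sv->is_utf8) return;
    for (i = 0; i < sv->len; i++) {
        unsigned char c = sv->buf[i];
        if (c < 0x80) {
            assert(o + 1 < sizeof out);
            out[o++] = c;
        } else {
            assert(o + 2 < sizeof out);
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    memcpy(sv->buf, out, o + 1);
    sv->len = o;
    sv->is_utf8 = 1;
}

/* Raw access in the spirit of SvPVX: the pointer, with no conversion. */
static const unsigned char *toy_pvx(const toy_sv *sv)
{
    return sv->buf;
}

/* Format-checked read: copy the characters out as one byte each.
   Returns the count, or (size_t)-1 if a character does not fit in a
   byte, the encoding is malformed, or dst is too small. */
static size_t toy_get_bytes(const toy_sv *sv, unsigned char *dst, size_t cap)
{
    size_t i = 0, o = 0;
    while (i < sv->len) {
        unsigned char c = sv->buf[i++];
        if (sv->is_utf8 && c >= 0x80) {
            /* Only lead bytes C2 and C3 encode U+0080..U+00FF. */
            if ((c != 0xC2 && c != 0xC3) || i >= sv->len) return (size_t)-1;
            if ((sv->buf[i] & 0xC0) != 0x80) return (size_t)-1;
            c = (unsigned char)(((c & 0x03) << 6) | (sv->buf[i++] & 0x3F));
        }
        if (o >= cap) return (size_t)-1;
        dst[o++] = c;
    }
    return o;
}
```

Storing "\x80\x81" with `toy_set_bytes` and then calling `toy_upgrade` leaves `toy_pvx` pointing at the four bytes C2 80 C2 81, while `toy_get_bytes` still yields 80 81. In real XS code the format-aware route would presumably be a check of SvUTF8 or a call such as SvPVbyte rather than a bare SvPVX.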
Gotcha! (At last!)
After utf8::upgrade($byte_string), $byte_string is no longer a byte string. It's a character string, or a codepoint string, but not a "byte string". I was (as a result of your previous pedantry) very specific in my choice of title for this thread.
And, as I said back up there somewhere, "Data either originates from within my program, or from without. And in either case, Perl will treat it as bytes unless I do something explicit to indicate that it should do otherwise. And since I know I'm not going to do that, I do not have to consider it."
And, despite your continued attempts to defend it, your assertion that my posted code "...can silently encode your bytes using UTF-8." is just plain fiction. Any encoding has to be done explicitly, by the programmer. It cannot occur "silently".
And "Magic isn't handled if any is present." is irrelevant!
So that brings us back to "It can segfault ...". Guess what:
Ignoring that your attempt to correct the deficiencies you perceive in my code contains
if (!sv_in || !sv_pad)
croak("usage");
which is a redundant code path that will never be exercised. And this:
{
STRLEN i = l_in;
which is never used.
It can also segfault! Try this:
print interleave_bytes( undef, 0 );
And that's not the only failure mode it displays.
So, if you're gonna stand on that high horse throwing stones, you really ought to make sure that your mount doesn't have a glass jaw!