Re^3: XS efficient copy (SvCUR

Re^4: XS efficient copy (SvCUR_set) by creamygoodness (Curate) on Apr 07, 2006 at 03:31 UTC
Hi Tye,

While it's true that `SvPV_nolen` is safer for reading, `SvPVX` and `SvPV_nolen` are equally dangerous with regard to the write operation that TheDauthi wants to perform. In both cases, it is absolutely necessary to 1) allocate space via `newSV(STRLEN)`, `SvGROW(SV, STRLEN)`, etc., and 2) make the SV "POK" via `SvPOK_on(SV)`, so that it knows it contains a string.

The `SvPV_nolen` macro first checks the SV's private `SVf_POK` flag to see whether it contains a string. If it does, then it returns the pointer to the string, via `SvPVX(SV)`. If it doesn't, it calls `sv_2pv_flags`, which spits out a `Use of uninitialized value` warning, upgrades the sv, and returns `(char*)""`. That's safe to read from, but if you try to write to it... kaboom.

Here's a demo app...

    #!/usr/bin/perl
    use strict;
    use warnings;

    use Inline C => <<'END_C';

    void POKe() {
        SV   *good_sv, *bad_sv;
        char *good_ptr, *bad_ptr;

        good_sv = newSV(83);
        bad_sv  = newSV(83);

        SvPOK_on(good_sv); /* !!!! */
        good_ptr = SvPV_nolen(good_sv);
        Copy("Joy!", good_ptr, 5, char); /* 4 chars plus the trailing NUL */
        SvCUR_set(good_sv, 4);
        fprintf(stderr, "%s\n", SvPVX(good_sv));

        /* bad_sv was never made POK, so this is not its own buffer */
        bad_ptr = SvPV_nolen(bad_sv);
        fprintf(stderr, "wait for it...\n");
        Copy("DEATH!", bad_ptr, 6, char);
        fprintf(stderr, "in heaven, everything is fine...");
    }

    END_C

    POKe();

... and here's the output on my system...

    slothbear:~/perltest marvin$ perl sv_poke.plx
    Joy!
    Use of uninitialized value in subroutine entry at sv_poke.plx line 33.
    wait for it...
    Bus error
    slothbear:~/perltest marvin$

-- Marvin Humphrey Rectangular Research ― http://www.rectangular.com
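For anyone who wants to poke at the branch described above without crashing a perl process, here is a minimal stand-alone sketch in plain C. It is only a toy model under stated assumptions, not perl's implementation: `toy_sv`, `toy_new`, `toy_pv_nolen`, `toy_is_writable`, and `toy_empty` are hypothetical names invented for illustration, and the real `sv_2pv_flags` does considerably more (the warning and the upgrade are left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for an SV. NOT perl's real struct; all names are hypothetical. */
typedef struct {
    char  *buf;  /* allocated string buffer */
    size_t len;  /* bytes allocated */
    size_t cur;  /* bytes in use */
    int    pok;  /* nonzero once the scalar "knows" it holds a string */
} toy_sv;

/* Shared read-only fallback, standing in for the (char*)"" result. */
static const char toy_empty[] = "";

/* Loosely like newSV(len): allocates a buffer but leaves the flag off. */
toy_sv *toy_new(size_t len) {
    toy_sv *sv = malloc(sizeof *sv);
    assert(sv != NULL);
    sv->buf = malloc(len + 1);
    assert(sv->buf != NULL);
    sv->buf[0] = '\0';
    sv->len = len + 1;
    sv->cur = 0;
    sv->pok = 0;
    return sv;
}

/* Models the branch: flag set -> the scalar's own buffer, else the fallback. */
const char *toy_pv_nolen(const toy_sv *sv) {
    return sv->pok ? sv->buf : toy_empty;
}

/* True only when the returned pointer is the scalar's own writable buffer. */
int toy_is_writable(const toy_sv *sv) {
    return toy_pv_nolen(sv) == sv->buf;
}
```

In this model, a scalar fresh from `toy_new` hands back the shared empty string until its `pok` field is set, which mirrors why the second `Copy` in the demo writes somewhere it shouldn't.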
Re^5: XS efficient copy (SvCUR_set) by tye (Sage) on Apr 07, 2006 at 06:46 UTC
Thanks for the corrections. My memory was rusty and I didn't double check it. Mea culpa.

Having double checked now, I'd still do things differently than you did, but I realized more of the reasons for my reluctance to use SvPVX() and SvPOK_on(): they had to do with dealing with scalars passed in rather than one freshly created right there. So I now agree that your suggestion is safe (for this particular case) (and mine was fatally flawed, of course).

I'd personally still avoid using SvPVX() and SvPOK_on(), as sticking with techniques that work in both cases makes sense to me. But I see I used SvPV_force(), and getting the "use of undefined value" warning was actually a desired feature. If I were creating a new scalar, then I'd `sv_setpvn(sv,"",0)` before doing SvPV_force() and thus not get the warning when it wasn't appropriate (in part because these two steps would likely be in separate macros "of my own design"). But that has a trivial amount of extra overhead that some might dislike.

I'd also use SvPOK_only(), but do that and the SvCUR_set() only after the call to fill the allocated buffer had succeeded.

So I guess I'd do something like this in 3 steps:

1) Allocate a scalar containing an empty string (preferably with the desired size of buffer).
2) Prepare the scalar to have a large enough buffer and extract a pointer to it (using a macro that doesn't assume a pristine SV).
3) (After the buffer has been "filled") Mark the scalar as containing the string of the proper length.

I find that layer of abstraction makes the XS subroutine easier to understand and each of my three macros easier to understand in isolation. In fact, I spent some concerted study on each of my macros, carefully verifying implementation details of the "sv" macros that they use to make sure my macros were correct and safe. And then I happily forgot most of the niggling details of all of those "sv" macros, which I'd never faithfully remember anyway.

Thanks again and sorry for the misinformation.

- tye
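The three steps tye lists can be sketched in plain C to show the ordering, in particular that the length and string flag are only set once the fill has succeeded. This is a toy model under stated assumptions, not tye's actual macros and not the perl API: `toy_sv`, `toy_new_empty`, `toy_buffer`, `toy_commit`, and `toy_fill` are all hypothetical names, and the comments only loosely map them to the calls mentioned in the post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an SV. NOT perl's real struct; all names are hypothetical. */
typedef struct {
    char  *buf;  /* string buffer */
    size_t len;  /* bytes allocated */
    size_t cur;  /* bytes in use */
    int    pok;  /* nonzero when the scalar holds a valid string */
} toy_sv;

/* Step 1: a scalar holding an empty string, with room for `want` bytes
   (loosely the newSV plus sv_setpvn(sv,"",0) part). */
toy_sv *toy_new_empty(size_t want) {
    toy_sv *sv = malloc(sizeof *sv);
    assert(sv != NULL);
    sv->buf = malloc(want + 1);
    assert(sv->buf != NULL);
    sv->buf[0] = '\0';
    sv->len = want + 1;
    sv->cur = 0;
    sv->pok = 1;
    return sv;
}

/* Step 2: make sure there is room for `want` bytes and return the buffer.
   Does not assume a pristine scalar; it grows whatever is already there. */
char *toy_buffer(toy_sv *sv, size_t want) {
    if (sv->len < want + 1) {
        char *grown = realloc(sv->buf, want + 1);
        assert(grown != NULL);
        sv->buf = grown;
        sv->len = want + 1;
    }
    return sv->buf;
}

/* Step 3: call only after the fill succeeded (loosely the SvPOK_only
   plus SvCUR_set part). */
void toy_commit(toy_sv *sv, size_t used) {
    assert(used < sv->len);
    sv->buf[used] = '\0';
    sv->cur = used;
    sv->pok = 1;
}

/* Hypothetical filler that can fail: copies src only if it fits in `room`.
   Returns the number of bytes written, or -1 without touching dst. */
long toy_fill(char *dst, size_t room, const char *src) {
    size_t n = strlen(src);
    if (n > room)
        return -1;
    memcpy(dst, src, n);
    return (long)n;
}
```

A caller would run `toy_new_empty`, then `toy_buffer`, then `toy_fill`, and call `toy_commit` only when `toy_fill` returns a non-negative count. If the fill fails, the scalar is still a valid empty string rather than one claiming a length it doesn't have.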
Re^6: XS efficient copy (SvCUR_set) by creamygoodness (Curate) on Apr 07, 2006 at 18:36 UTC
> SvCUR_set() only after the call to fill the allocated buffer had succeeded.

I like this idiom and I'm going to adopt it.

-- Marvin Humphrey Rectangular Research ― http://www.rectangular.com
Re^5: XS efficient copy (SvCUR_set) by TheDauthi (Sexton) on Apr 07, 2006 at 20:36 UTC
> The SvPV_nolen macro first checks the SV's private SVf_POK flag to see whether it contains a string. If it does, then it returns the pointer to the string, via SvPVX(SV). If it doesn't, it calls sv_2pv_flags, which spits out a Use of uninitialized value warning, upgrades the sv, and returns (char*)"". That's safe to read from, but if you try to write to it... kaboom.

That was actually why I was calling SvPV_nolen (to force the SV to contain a string), but my understanding of it was a bit incorrect. I was expecting it to set that flag, but I was also expecting it to return the pointer to the existing buffer I had allocated with newSV, not an empty string. When I realized that perl also had the length of the string (which, in retrospect, is obvious), the other part fell into place.

I'm actually thinking that it IS safer to use SvPVX to get the buffer, so that I don't promote it 'by accident'.
Re^6: XS efficient copy (SvCUR_set) by creamygoodness (Curate) on Apr 09, 2006 at 15:35 UTC
> I'm actually thinking that it IS safer to use SvPVX to get the buffer, so that I don't promote it 'by accident'.

I agree that using `SvPVX` in this context is better practice.

If you haven't allocated memory for the string, `SvPVX` will attempt to dereference a struct member that isn't there, and segfault. If you haven't both allocated memory and made the string POK, `SvPV_nolen` will return a buffer that's not writeable, and you'll get a segfault when you try to write to it.

perlapi seems to suggest that `SvPV_nolen` is safer. That is not true for this usage, and putting it in there lends a false sense of security. Better to use `SvPVX` -- so it's clear that you're working without a net. If you're lucky.

In general, Valgrind is extremely helpful for detecting these kinds of problems, and I strongly recommend it. Sadly, I don't think it will help in this specific case, because Valgrind is Linux-only (there's also a FreeBSD variant, but it doesn't seem to work with Perl).

-- Marvin Humphrey Rectangular Research ― http://www.rectangular.com