in reply to Re: [XS] sv_setpv change in behaviour with perl-5.42.0 and later
in thread [XS] sv_setpv change in behaviour with perl-5.42.0 and later

There's no reason for sv_setpv to create a new buffer whose length is based on the old string buffer's length.

Except that, with earlier perls, the new buffer is the same length as the old buffer - so this is a change in behaviour, and one that I did not expect.
Let's say there's another XSub to which I want to subsequently pass that buffer, and it's an XSub that requires the buffer to have at least (say) 50 bytes available. For example:
void bar(unsigned char * buffer) { buffer[49] = 65; }
On perl 5.40.0 I could re-use that PV that I created in the demo and pass it to bar(), because its SvLEN is still guaranteed to be at least 60.
But on perl-5.42.0, SvLEN has been reduced to 16, so passing that PV to bar() will result in the buffer being overflowed.
At least, that's the way it looks to me. (And I'm assuming that such buffer overflow is something to be avoided.)
Nothing that can't be dealt with, of course - but nonetheless surprising.
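
A minimal sketch of one way to guard against that overflow, assuming the only requirement is that the buffer has room for at least 50 bytes before bar() writes to it. SvGROW and SvPV_force_nolen are documented in perlapi; the helper names ensure_len and get_len are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use Inline C => <<'EOC';

/* Same as the demo: copy a short string into the SV. */
void foo(SV * buffer) {
    char *data = "Hello there";
    sv_setpv(buffer, data);
}

/* Hypothetical helper (not part of perlapi): make sure the SV's string
   buffer has room for at least `bytes` bytes, whatever SvLEN was before. */
void ensure_len(SV * buffer, int bytes) {
    (void)SvPV_force_nolen(buffer);      /* SvGROW needs an SV that holds a PV */
    (void)SvGROW(buffer, (STRLEN)bytes); /* does nothing if already big enough */
}

/* Hypothetical helper: expose SvLEN to the Perl side. */
int get_len(SV * buffer) {
    return (int)SvLEN(buffer);
}

void bar(unsigned char * buffer) { buffer[49] = 65; }

EOC

my $buffer = 'z' x 60;
foo($buffer);
ensure_len($buffer, 50);   # grow first, then hand the buffer to bar()
bar($buffer);
printf "SvLEN after ensure_len: %d\n", get_len($buffer);
```

With the guard in place, bar() no longer depends on whatever SvLEN sv_setpv happened to leave behind.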

Here's a second script that demonstrates that change in behaviour:

use strict;
use warnings;
use Devel::Peek;
use Inline C => <<'EOC';

void foo(SV * buffer) {
    char *data = "Hello there";
    sv_setpv(buffer, data);
}

void bar(unsigned char * buffer) {
    buffer[49] = 65;
}

void _set_CUR(SV * buffer, int bytes) {
    SvCUR_set(buffer, bytes);
}

EOC

my $buffer = 'z' x 60;
Dump $buffer;
foo($buffer);
Dump $buffer;
bar($buffer);
_set_CUR($buffer, 60); # Ensure that Devel::Peek::Dump will display all 60 bytes.
Dump $buffer;

On perl-5.40.0 and earlier, the final Devel::Peek::Dump reveals exactly what I expect:

SV = PV(0x254ba8dbf08) at 0x254ba920660
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x254bce10dc8 "Hello there\x00zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzAzzzzzzzzzz"\0
  CUR = 60
  LEN = 62

On perl-5.42.0, the final Dump appears as:

SV = PV(0x224fe4c0aa0) at 0x224fe4f4cd0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x22480c0e170 "Hello there\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\r\x8B7\xBC\x00\xF8\x00\x88\xC0\x82\x9F\x80$\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10A\x8D\x80$\x02\x00\x00\x00\xB5K\xFE"
  CUR = 60
  LEN = 16

(The 'A' at index 49 can be seen if you look closely.)
However, no-one else has been bothered by this - so I guess I just deal with it appropriately.
I do have a working solution to my issue that avoids sv_setpv and avoids re-using the same PV. (I might try improving it, but I think it's good enough as it already stands. And it's probably the same as the solution I would have used even if this change of behaviour in 5.42.0 did not exist.)

Thank you for all of the detail, BTW - much appreciated.
In fact, thank you to all respondents.

Cheers,
Rob

Re^3: [XS] sv_setpv change in behaviour with perl-5.42.0 and later
by ikegami (Patriarch) on Jan 29, 2026 at 02:53 UTC

    Except that, with earlier perls, the new buffer is the same length as the old buffer

    That's not true. When older Perls create a new buffer, they are just a bit larger than necessary, just like in 5.42.

    In all the examples where you claim a new buffer was allocated based on the size of the old one, you are mistaken. As I explained, no new buffer was allocated in those cases. sv_setpv is simply modifying the existing buffer, something you can't do with a shared buffer. And since Perl never shrinks a buffer, modifying the buffer does not shrink it.[1]


    1. It can free it, e.g. using undef $s; (as opposed to $s = undef;), which could eventually result in a shorter buffer in $s. But I don't know of any circumstances in which it directly shrinks a buffer.
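
    A minimal sketch of the distinction drawn in the footnote, assuming the core B module is an acceptable way to read SvLEN from Perl. The helper name sv_len is hypothetical; the exact LEN values will vary between perls and platforms, so none are shown.

```perl
use strict;
use warnings;
use B qw( svref_2object );

# Hypothetical helper: report SvLEN of a scalar through the core B module.
sub sv_len { return svref_2object( \$_[0] )->LEN }

my $s = "x" x 999;
$s .= "x";                      # Append so that $s owns a buffer of its own.
my $len_full = sv_len($s);

$s = undef;                     # Assignment: the buffer stays attached.
my $len_assigned = sv_len($s);

undef $s;                       # undef operator: the buffer is freed.
my $len_freed = sv_len($s);

$s = "abc";                     # A buffer is needed again, so $s gets another one.
my $len_new = sv_len($s);

printf "%-12s LEN = %d\n", @$_
    for [ 'full',       $len_full ],
        [ '$s = undef', $len_assigned ],
        [ 'undef $s',   $len_freed ],
        [ 'refilled',   $len_new ];
```

    The buffer is never shrunk in place here: it is either kept as it is or freed, and a later assignment gives the scalar a different, shorter buffer.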

      ... modifying the buffer does not shrink it

      So what does SvLEN tell us about the buffer ?
      According to perlapi documentation:
      "SvLEN" Returns the size of the string buffer in the SV, not including any part attributable to "SvOOK". See "SvCUR".
      Now, I don't understand the reference to "SvOOK" and "SvCUR", but the bit that says "Returns the size of the string buffer in the SV" means (to me) that if the value of LEN (ie SvLEN) has been reduced, then the size of the buffer has been reduced - ie the buffer has been shrunk.
      Not so ?

      Cheers,
      Rob
        I think this is a matter of semantics. A particular buffer, once allocated, has size SvLEN() and never shrinks. However, that buffer can (under some circumstances) be freed and a different buffer allocated with a smaller SvLEN().

        Dave.

        SvLEN is the size of the buffer currently referenced by SvPVX.

        I didn't say a scalar's buffer couldn't be replaced with a smaller one (which would result in a reduction of a scalar's SvLEN). But even so, it doesn't replace a scalar's buffer unless necessary either.

        That's why copying a short string over a long one in a buffer doesn't affect the buffer size (in any version of Perl).

        use Devel::Peek qw( Dump );
        $_ = "x" x 999;
        $_ .= "x";   # Force unsharing.
        Dump( $_ );
        $_ = "abc";
        Dump( $_ );

        SV = PV(0x5b5d1f3fdee0) at 0x5b5d1f439bc8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x5b5d1f441580 "xxx[...]xxx"\0
          CUR = 1000
          LEN = 1001
        SV = PV(0x5b5d1f3fdee0) at 0x5b5d1f439bc8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x5b5d1f441580 "abc"\0       <-- Same buffer
          CUR = 3
          LEN = 1001                        <-- Same size

        And that's why whether SvLEN goes down or not depends on whether Perl must allocate a fresh buffer or not.


        OOK is a mechanism which can be used to efficiently delete from the start of a string. Rather than shifting the entire contents of the buffer, OOK can be used to fake the start and size of the buffer instead.

        use Devel::Peek qw( Dump );
        $_ = "abcdefghi";
        $_ .= "j";   # Force unsharing.
        Dump( $_ );
        substr( $_ , 0 , 1 ) = "";
        Dump( $_ );

        SV = PV(0x600760853ee0) at 0x6007608a35d8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x60076086a2e0 "abcdefghij"\0
          CUR = 10
          LEN = 16
        SV = PV(0x600760853ee0) at 0x6007608a35d8
          REFCNT = 1
          FLAGS = (POK,OOK,pPOK)
          OFFSET = 1
          PV = 0x60076086a2e1 ( "\x01" . ) "bcdefghij"\0
          CUR = 9
          LEN = 15

        There's still a 16 byte buffer at 0x60076086a2e0, and the scalar's buffer was replaced with a 15 byte virtual buffer at 0x60076086a2e1.


        SvCUR is offered as a contrast. SvLEN is the size of the buffer, and SvCUR is the portion used. Also, someone looking for SvCUR might have landed on SvLEN.