in reply to [XS] sv_setpv change in behaviour with perl-5.42.0 and later

The difference is that 5.42 made constants produced by constant folding eligible for buffer sharing ("COW").[1]

Use SvGROW if you want the buffer to have a minimum size.


In the 5.42 run, $buffer initially shares a buffer with the constant created by 'z' x 60. This is evident from the IsCOW flag, which indicates that the buffer is shared with another scalar.

This means that set_pv must create a new buffer. Notice how the address of the buffer changed from 0x2352275d050 to 0x2352279a8a0.

There's no reason for set_pv to create a new buffer whose length is based on the old string buffer's length. The new buffer's length will be based on the length of the string being assigned.

    SV = PV(0x2352069fd80) at 0x235206db678
      REFCNT = 1
      FLAGS = (POK,IsCOW,pPOK)    <- IsCOW = Shared buffer
      PV = 0x2352275d050 "zzz...zzz"\0
      CUR = 60
      LEN = 64
      COW_REFCNT = 1
    SV = PV(0x2352069fd80) at 0x235206db678
      REFCNT = 1
      FLAGS = (POK,pPOK)          <- No longer sharing a buffer
      PV = 0x2352279a8a0 "Hello there"\0    <- New buffer at new address
      CUR = 11
      LEN = 16

In the 5.40 run, $buffer doesn't share a buffer with another scalar, as noted by the lack of the IsCOW flag.

This means that set_pv can reuse the existing buffer if it's large enough. And it is. Notice how the address of the buffer remains 0x2077084ae90.

    SV = PV(0x2076e35a3b0) at 0x2076e41b338
      REFCNT = 1
      FLAGS = (POK,pPOK)          <- Not sharing a buffer
      PV = 0x2077084ae90 "zzz...zzz"\0
      CUR = 60
      LEN = 62
    SV = PV(0x2076e35a3b0) at 0x2076e41b338
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x2077084ae90 "Hello there"\0    <- Same address. Same buffer
      CUR = 11
      LEN = 62
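
The two runs can be mimicked with a toy model in plain C. This is only a sketch, not Perl's implementation: toy_sv, toy_new, toy_setpv, toy_grow and the round-up-to-16 allocation rule are all made up for illustration. It shows an unshared scalar keeping its buffer on assignment, a shared one getting a fresh buffer sized for the new string, and an explicit grow restoring a minimum size afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string scalar. NOT Perl's real SV layout. */
typedef struct {
    char  *pv;     /* string buffer */
    size_t cur;    /* bytes of string data, excluding the NUL (like CUR) */
    size_t len;    /* bytes allocated (like LEN) */
    bool   is_cow; /* buffer is shared with another scalar (like IsCOW) */
} toy_sv;

/* Allocation sizes are rounded up to a multiple of 16 (arbitrary for this model). */
static size_t toy_round(size_t n) { return (n + 15) / 16 * 16; }

/* Build a scalar holding `s` in a buffer of `len` bytes (len > strlen(s)). */
static toy_sv toy_new(const char *s, size_t len, bool shared)
{
    toy_sv sv = { malloc(len), strlen(s), len, shared };
    assert(sv.pv != NULL && sv.cur < len);
    memcpy(sv.pv, s, sv.cur + 1);
    return sv;
}

/* Make sure the scalar owns a private buffer of at least `want` bytes. */
static void toy_grow(toy_sv *sv, size_t want)
{
    if (want < sv->cur + 1)
        want = sv->cur + 1;
    if (sv->is_cow) {
        /* Shared: copy into a private buffer; the other owner keeps the old one. */
        char *copy = malloc(toy_round(want));
        assert(copy != NULL);
        memcpy(copy, sv->pv, sv->cur + 1);
        sv->pv = copy;
        sv->len = toy_round(want);
        sv->is_cow = false;
    } else if (sv->len < want) {
        char *bigger = realloc(sv->pv, toy_round(want));
        assert(bigger != NULL);
        sv->pv = bigger;
        sv->len = toy_round(want);
    }
}

/* Assign a C string to the scalar. */
static void toy_setpv(toy_sv *sv, const char *src)
{
    size_t need = strlen(src) + 1;
    if (sv->is_cow) {
        /* Shared: leave the old buffer to its other owner and allocate a
         * new one sized for the incoming string, not for the old contents. */
        sv->pv = malloc(toy_round(need));
        assert(sv->pv != NULL);
        sv->len = toy_round(need);
        sv->cur = 0;
        sv->is_cow = false;
    } else if (sv->len < need) {
        sv->cur = 0;
        toy_grow(sv, need);
    }
    /* Unshared and big enough: the existing buffer is reused as-is. */
    memcpy(sv->pv, src, need);
    sv->cur = need - 1;
}
```

In real XS, the analogous step to toy_grow would be a call such as SvGROW(buffer, 50) on the SV before its buffer is handed to code that needs 50 bytes.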

So why is the buffer shared with 'z' x 60 in one version and not the other?

5.42 fixed a bug that prevented the buffer of constants created by constant folding from being shared. An in-depth explanation of the bug follows.

When a string buffer is shared, the IsCOW flag of both scalars is set, and a share count is placed in the unused portion of the buffer.[2] This means that for COW to be used, there must be free space at the end of the string buffer, and the string buffer must be modifiable.
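
To make the space requirement concrete, here is a minimal sketch in plain C. It assumes the share count is a single byte kept in the very last byte of the allocation. The names toy_buf, toy_has_room_for_count and toy_count_slot are invented for this example and are not part of Perl's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy view of a string buffer: `len` bytes allocated, holding `cur` bytes
 * of string data followed by a NUL. Assumption for this sketch: the share
 * count occupies the final byte of the allocation. */
typedef struct {
    unsigned char *pv;
    size_t cur;
    size_t len;
} toy_buf;

/* True if one spare byte remains after the string and its trailing NUL. */
static bool toy_has_room_for_count(const toy_buf *b)
{
    return b->len >= b->cur + 2;
}

/* Address of the share count. Only valid when there is room for it. */
static unsigned char *toy_count_slot(const toy_buf *b)
{
    assert(toy_has_room_for_count(b));
    return b->pv + b->len - 1;
}
```

With CUR = 6 and LEN = 16 there is plenty of room. With CUR = 15 and LEN = 16 the NUL takes the last byte, so there is nowhere to put a count and the buffer can't be shared in this model.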

When Perl encounters a literal, it produces a read-only scalar in memory.[3] Being read-only would normally make it ineligible for COW. But that would be dumb, so Perl marks the scalar as already being shared with zero other scalars.

    $ perl -MDevel::Peek -e'Dump( "zzzzzz" )'
    SV = PV(0x57f44e969f20) at 0x57f44e9980a8
      REFCNT = 1
      FLAGS = (POK,IsCOW,READONLY,PROTECT,pPOK)
      PV = 0x57f44e9e8140 "zzzzzz"\0
      CUR = 6
      LEN = 16
      COW_REFCNT = 0

One wouldn't normally encounter a scalar shared with zero other scalars. But since Perl doesn't need to check if a scalar's buffer can be shared if it's already shared, this permits the read-only scalar's buffer to be shared.
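
The shortcut can be sketched as follows in plain C. This is a toy model under stated assumptions, not Perl's code: toy_sv and toy_try_share are hypothetical names, and the count is kept in a struct field here purely to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy scalar carrying only the fields that matter for the sharing decision. */
typedef struct {
    bool     is_cow;    /* like the IsCOW flag */
    bool     readonly;  /* like the READONLY flag */
    size_t   cur, len;  /* like CUR and LEN */
    unsigned others;    /* other scalars using the buffer (like COW_REFCNT) */
} toy_sv;

/* Try to let one more scalar share src's buffer. Returns true on success. */
static bool toy_try_share(toy_sv *src)
{
    if (src->is_cow) {
        /* Already set up for sharing: no eligibility check, just count. */
        src->others++;
        return true;
    }
    /* Not yet shared: the buffer must be modifiable and have a spare byte. */
    if (src->readonly || src->len < src->cur + 2)
        return false;
    src->is_cow = true;
    src->others = 1;
    return true;
}
```

A literal that starts out as IsCOW with a count of zero takes the first branch even though it is read-only. A read-only scalar without IsCOW falls through to the eligibility check and is refused.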

For literals, this was true of both 5.42 and earlier versions.

    $ 5.42t/bin/perl -MDevel::Peek -e'Dump( "zzzzzz" )'
    SV = PV(0x582cd0b72f20) at 0x582cd0ba1098
      REFCNT = 1
      FLAGS = (POK,IsCOW,READONLY,PROTECT,pPOK)
      PV = 0x582cd0ba4a40 "zzzzzz"\0
      CUR = 6
      LEN = 16
      COW_REFCNT = 0

    $ 5.40t/bin/perl -MDevel::Peek -e'Dump( "zzzzzz" )'
    SV = PV(0x6381a4328f20) at 0x6381a4357028
      REFCNT = 1
      FLAGS = (POK,IsCOW,READONLY,PROTECT,pPOK)
      PV = 0x6381a43a7d10 "zzzzzz"\0
      CUR = 6
      LEN = 16
      COW_REFCNT = 0

But before 5.42, constants produced by constant folding weren't being set up this way, so they weren't eligible for COW.

    $ 5.42t/bin/perl -MDevel::Peek -e'Dump( "zzz" . "zzz" )'
    SV = PV(0x6110ea0263a0) at 0x6110ea0540a0
      REFCNT = 1
      FLAGS = (PADTMP,POK,IsCOW,READONLY,PROTECT,pPOK)
      PV = 0x6110ea080fc0 "zzzzzz"\0
      CUR = 6
      LEN = 16
      COW_REFCNT = 0

    $ 5.40t/bin/perl -MDevel::Peek -e'Dump( "zzz" . "zzz" )'
    SV = PV(0x5be570a973a0) at 0x5be570ac50c0
      REFCNT = 1
      FLAGS = (PADTMP,POK,READONLY,PROTECT,pPOK)
      PV = 0x5be570af23e0 "zzzzzz"\0
      CUR = 6
      LEN = 16

    $ 5.42t/bin/perl -MDevel::Peek -e'Dump( "z" x 6 )'
    SV = PV(0x607fa35c4200) at 0x607fa35f2138
      REFCNT = 1
      FLAGS = (PADTMP,POK,IsCOW,READONLY,PROTECT,pPOK)
      PV = 0x607fa3603d10 "zzzzzz"\0
      CUR = 6
      LEN = 16
      COW_REFCNT = 0

    $ 5.40t/bin/perl -MDevel::Peek -e'Dump( "z" x 6 )'
    SV = PV(0x5d8bda39d200) at 0x5d8bda3cafb8
      REFCNT = 1
      FLAGS = (PADTMP,POK,READONLY,PROTECT,pPOK)
      PV = 0x5d8bda3d5340 "zzzzzz"\0
      CUR = 6
      LEN = 16

  1. From perl5420delta,

    Constant-folded strings are now shareable via the Copy-on-Write mechanism. [GH #22163]

    The following code would previously have allocated eleven string buffers, each containing one million "A"s:

        my @scalars;
        push @scalars, ("A" x 1_000_000) for 0..9;

    Now a single buffer is allocated and shared between a CONST OP and the ten scalar elements of @scalars.

    Note that any code using this sort of constant to simulate memory leaks (perhaps in test files) must now permute the string in order to trigger a string copy and the allocation of separate buffers. For example, ("A" x 1_000_000).time might be a suitable small change.

  2. The share count must be somewhere all users of the buffer can find, and this is a very efficient solution in terms of speed and memory. But it means COW can't be used for every scalar.

  3. So you don't do stupid things like

        for ( 1 .. 2 ) {
           my $r = \"abc";
           say $$r;
           $$r = "def";
        }

Re^2: [XS] sv_setpv change in behaviour with perl-5.42.0 and later
by syphilis (Archbishop) on Jan 29, 2026 at 01:23 UTC
    There's no reason for set_pv to create a new buffer whose length is based on the old string buffer's length.

    Except that, with earlier perls, the new buffer is the same length as the old buffer - so this is a change in behaviour, and one that I did not expect.
    Let's say there's another XSub to which I want to subsequently pass that buffer, and it's an XSub that requires the buffer to have at least (say) 50 bytes available. For example:
        void bar(unsigned char * buffer) {
          buffer[49] = 65;
        }
    On perl 5.40.0 I could re-use that PV that I created in the demo and pass it to bar(), because its SvLEN is still guaranteed to be at least 60.
    But on perl-5.42.0, SvLEN has been reduced to 16, so passing that PV to bar() will result in the buffer being overflowed.
    At least, that's the way it looks to me. (And I'm assuming that such buffer overflow is something to be avoided.)
    Nothing that can't be dealt with, of course - but nonetheless surprising.

    Here's a second script that demonstrates that change in behaviour:
        use strict;
        use warnings;
        use Devel::Peek;

        use Inline C =><<'EOC';

        void foo(SV * buffer) {
          char *data = "Hello there";
          sv_setpv(buffer, data);
        }

        void bar(unsigned char * buffer) {
          buffer[49] = 65;
        }

        void _set_CUR(SV * buffer, int bytes) {
          SvCUR_set(buffer, bytes);
        }

        EOC

        my $buffer = 'z' x 60;
        Dump $buffer;
        foo($buffer);
        Dump $buffer;
        bar($buffer);
        _set_CUR($buffer, 60); # Ensure that Devel::Peek::Dump will display all 60 bytes.
        Dump $buffer;
    On perl-5.40.0 and earlier, the final Devel::Peek::Dump reveals exactly what I expect:
        SV = PV(0x254ba8dbf08) at 0x254ba920660
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x254bce10dc8 "Hello there\x00zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzAzzzzzzzzzz"\0
          CUR = 60
          LEN = 62
    On perl-5.42.0, the final Dump appears as:
        SV = PV(0x224fe4c0aa0) at 0x224fe4f4cd0
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x22480c0e170 "Hello there\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\r\x8B7\xBC\x00\xF8\x00\x88\xC0\x82\x9F\x80$\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10A\x8D\x80$\x02\x00\x00\x00\xB5K\xFE"
          CUR = 60
          LEN = 16
    (The 'A' at index 49 can be seen if you look closely.)
    However, no-one else has been bothered by this - so I guess I just deal with it appropriately.
    I do have a working solution to my issue that avoids sv_setpv and avoids re-using the same PV. (I might try improving it, but I think it's good enough as it already stands. And it's probably the same as the solution I would have used even if this change of behaviour in 5.42.0 did not exist.)

    Thank you for all of the detail, BTW - much appreciated.
    In fact, thank you to all respondents.

    Cheers,
    Rob

      Except that, with earlier perls, the new buffer is the same length as the old buffer

      That's not true. When older Perls create a new buffer, it's just a bit larger than necessary, just like in 5.42.

      In all the examples where you claim a new buffer was allocated based on the size of the old one, you are mistaken. As I explained, no new buffer was allocated in those cases. set_pv is simply modifying the existing buffer, something you can't do with a shared buffer. And since Perl never shrinks a buffer, modifying the buffer does not shrink it.[1]


      1. It can free it, e.g. using undef $s; (as opposed to $s = undef;), which could eventually result in a shorter buffer in $s. But I don't know of any circumstances in which it directly shrinks a buffer.

        ... modifying the buffer does not shrink it

        So what does SvLEN tell us about the buffer ?
        According to perlapi documentation:
        "SvLEN" Returns the size of the string buffer in the SV, not including any part attributable to "SvOOK". See "SvCUR".
        Now, I don't understand the reference to "SvOOK" and "SvCUR", but the bit that says "Returns the size of the string buffer in the SV" means (to me) that if the value of LEN (ie SvLEN) has been reduced, then the size of the buffer has been reduced - ie the buffer has been shrunk.
        Not so ?

        Cheers,
        Rob
Re^2: [XS] sv_setpv change in behaviour with perl-5.42.0 and later
by ikegami (Patriarch) on Jan 27, 2026 at 16:51 UTC

    Added a lot to parent.

Re^2: [XS] sv_setpv change in behaviour with perl-5.42.0 and later
by ikegami (Patriarch) on Jan 27, 2026 at 17:17 UTC

    And updated my explanation.