in reply to Re^2: 5.42: Does m// toss a string around?
in thread 5.42: Does m// toss a string around?

I think the PV method constructs new string rather than returns numeric value

No.

SvPVX creates an object that provides information about a scalar ($_[0] aka $s), then uses that object's PV method to obtain the address of the string buffer of $s.

I now suspect similar reasons, related to COW, can you explain please?

COW was introduced in 5.20. Before 5.20, Perl had to use hacks that malfunctioned and/or a more expensive alternative.

Re^4: 5.42: Does m// toss a string around?
by Anonymous Monk on Jan 29, 2026 at 23:30 UTC

    (OK, the PV method doesn't matter.) I thought I got it (s/// implies matching; $& and friends COW-share, in a way, the original buffer; hence the substitution operator (after 5.18) always re-allocates), but now I think the following example shows that "match and manually replace" doesn't force re-allocation, so why does s///?

    use Devel::Peek qw( Dump );
    $Devel::Peek::pv_limit = $Devel::Peek::pv_limit = 10;

    $_ = "x" x 99;
    $_ .= "x";  # Force unsharing.
    Dump( $_ );
    /(.+)/;
    Dump( $_ );
    $_ = 'aaa';
    Dump( $_ );

    print "\n\n";

    $_ = "x" x 99;
    $_ .= "x";  # Force unsharing.
    Dump( $_ );
    s/(.+)/aaa/;
    Dump( $_ );
    __END__
    5.042000
    SV = PV(0x1e2951e23e0) at 0x1e29521d298
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x1e295228ad0 "xxxxxxxxxx"...\0
      CUR = 100
      LEN = 101
    SV = PV(0x1e2951e23e0) at 0x1e29521d298
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x1e295228ad0 "xxxxxxxxxx"...\0
      CUR = 100
      LEN = 101
    SV = PV(0x1e2951e23e0) at 0x1e29521d298
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x1e295228ad0 "aaa"\0
      CUR = 3
      LEN = 101


    SV = PV(0x1e2951e23e0) at 0x1e29521d298
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x1e295228ad0 "xxxxxxxxxx"...\0
      CUR = 100
      LEN = 101
    SV = PV(0x1e2951e23e0) at 0x1e29521d298
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x1e295215d60 "aaa"\0
      CUR = 3
      LEN = 16

      So there's really two questions here.


      The first is why COW wasn't used for the matching half of the code.

      COW is normally used for the copy for $& and friends, but it wasn't used in this case because there's not enough space in the buffer.

      CUR = 100 LEN = 101

      I think two bytes are needed. One for the NUL, and one for the COW reference count. Replace $_ .= "x"; with chop;, and you get:

      SV = PV(0x5deca7958ee0) at 0x5deca7994d68
        REFCNT = 1
        FLAGS = (POK,pPOK)
        PV = 0x5deca7998850 "xxxxxxxxxx"...\0
        CUR = 98
        LEN = 101
      SV = PV(0x5deca7958ee0) at 0x5deca7994d68
        REFCNT = 1
        FLAGS = (POK,IsCOW,pPOK)
        PV = 0x5deca7998850 "xxxxxxxxxx"...\0
        CUR = 98
        LEN = 101
        COW_REFCNT = 1
      SV = PV(0x5deca7958ee0) at 0x5deca7994d68
        REFCNT = 1
        FLAGS = (POK,IsCOW,pPOK)
        PV = 0x5deca797ad20 "aaa"\0
        CUR = 3
        LEN = 16
        COW_REFCNT = 1
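The spare-byte reasoning above can be checked without reading full Dump output. What follows is only a rough sketch: it reads CUR and LEN through the core B module and applies the "two spare bytes" rule, which is the guess stated in the post rather than a documented guarantee. The helper names buf_info and has_cow_room are made up for illustration, and the LEN values you see may differ between builds and allocators.

```perl
use strict;
use warnings;
use B ();

# Return (CUR, LEN) for the string buffer of the scalar passed in.
# $_[0] is an alias to the caller's variable, so no copy is made.
sub buf_info {
    my $sv = B::svref_2object( \$_[0] );
    return ( $sv->CUR, $sv->LEN );
}

# Assumption taken from the post: sharing needs two spare bytes,
# one for the trailing NUL and one for the COW reference count.
sub has_cow_room {
    my ( $cur, $len ) = buf_info( $_[0] );
    return $len - $cur >= 2 ? 1 : 0;
}

my $appended = "x" x 99;
$appended .= "x";    # as in the original test

my $chopped = "x" x 99;
chop $chopped;       # the variant suggested above

printf "appended: CUR=%d LEN=%d room=%d\n",
    buf_info($appended), has_cow_room($appended);
printf "chopped:  CUR=%d LEN=%d room=%d\n",
    buf_info($chopped), has_cow_room($chopped);
```

If the guess is right, a scalar reporting room=0 before a match should not show the IsCOW flag afterwards, while one reporting room=1 should.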

      The second question is why the scalar gets a new buffer for the substitution.

      I'm guessing the buffer of the scalar is stolen rather than copied for $& and friends, since the expectation is that it will change.

      That means the scalar always gets a new buffer.

        COW is normally used for the copy for $& and friends, but it wasn't used in this case because there's not enough space in the buffer

        Thanks. I think it follows that appending a character practically poisons a string in Perl (try 1e7 if on a modern PC):

        $ time perl -e '$_ = "a" x 1e6; $_ .= "a"; 1 while /./g'

        real 0m35.868s
        user 0m35.857s
        sys 0m0.008s
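To see whether the slowdown really tracks the missing spare byte, the same loop can be timed against the chop variant suggested earlier in the thread. This is only a sketch: the string length of 1e5 is an arbitrary smaller size chosen so the slow case finishes quickly, time_match_loop is a made-up helper name, and the absolute numbers will depend on the perl build and the machine.

```perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Run the match loop from the post over $_ and return elapsed seconds.
sub time_match_loop {
    my $start = time;
    1 while /./g;
    return time - $start;
}

# Variant from the post: append one character before matching.
$_ = "a" x 1e5;
$_ .= "a";
my $appended = time_match_loop();

# Comparison variant: remove one character instead of appending.
$_ = "a" x 1e5;
chop;
my $chopped = time_match_loop();

printf "append: %.3fs\nchop:   %.3fs\n", $appended, $chopped;
```

A large gap between the two timings would be consistent with the buffer being copied on every match when COW cannot be used; similar timings would suggest something else is going on.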

        Not to bring up 5.18 again, and I'm not advocating the use of s/// as below, but FWIW:

        $ perlbrew exec --with 5.18.4,5.42.0 time -p perl -e '
        > $_ = 1 x 1e6;
        > 1 while s/.$//;
        > '
        perl-5.18.4
        ==========
        real 0.14
        user 0.13
        sys 0.00

        perl-5.42.0
        ==========
        real 18.95
        user 18.94
        sys 0.00