in reply to Re^3: Tracking down an Lvalue bug?
in thread Tracking down an Lvalue bug?

Demonstrate that taking an lvalue substr ref causes the referenced substring to be copied, and that this is a bug.

Re^5: Tracking down an Lvalue bug?
by ikegami (Patriarch) on Mar 20, 2012 at 03:36 UTC
      Related to ref to read-only alias ... why??

      I don't think so.

      This demonstrates the problem (bug):

      perl -E"$x=chr(0); $x x= 100e6; <>; $r = \substr $x, 50e6; <>"

      Watch the process memory with ProcessManager:

      1. When it pauses the first time memory usage is ~98MB. Hit enter
      2. Now the memory usage shoots up to ~145MB.

        All that's changed is that it has taken a substr ref to the second half of the string.

      I suspect it is a result of this "fix".

      But I believe that fix is wrong, and that the original bug is spurious. (At least in part.)

      If you do this:

      my $ref; { my $string = "123"; $ref = \$string; } ## Here $string persists.

      No one is surprised that (the memory allocated to) $string persists beyond the block.

      Why should this be any different:

      my $ref; { my $string = "123"; $ref = \substr $string, 1, 1; } ## This was the (IMO false) "bug" scenario.

      In this case -- when the string goes out of scope -- I could make arguments for 3 possible outcomes:

      1. The memory allocated to $string persists:

        $ref continues to refer (directly) to the memory allocated to $string.

        Those parts of $string outside of $ref are inaccessible and irrecoverable until $ref is GC'd.

      2. The memory allocated to $string is GC'd; $ref becomes undef.

        Probably difficult to orchestrate.

      3. The memory allocated to $string persists until $ref is used the next time:

        At which point the extraneous (pre and/or post fix) bits of $string are GC'd leaving $ref pointing at just that which it references.

        This would require retaining a pointer back to 'parent' string with the lvalue ref, and when the lvalue magic fires it checks the refcount of its parent.

        If that refcount is 1 -- meaning it holds the only reference to it -- it frees the extraneous parts of the parent.

      Of the three, I'd prefer option 3, but would find option 1 completely in keeping with Perl's philosophy.

      Duplicating the referenced memory of every lvalue ref and then having to copy it back each time it is modified -- just to cover off a really obscure scenario that is, at worst, only mildly anomalous -- is crazy.
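As a minimal sketch of the scenario being argued about (the variable names $outer and $inner are only for illustration), the script below shows what is observable from Perl code: the reference is of type LVALUE, assigning through it edits the string it was taken from, and the ref is still readable after the string's scope has ended.

```perl
use strict;
use warnings;

my $string = "123";
my $ref    = \substr($string, 1, 1);   # lvalue ref to the middle character

print ref($ref), "\n";                 # LVALUE
print "$$ref\n";                       # 2

$$ref = "x";                           # writes through to $string
print "$string\n";                     # 1x3

# The scoped variant from the post: the ref outlives the lexical.
my $outer;
{
    my $inner = "123";
    $outer = \substr($inner, 1, 1);
}
print "$$outer\n";                     # 2
```

None of this says anything about how much memory is held; it only pins down the visible semantics that any of the three outcomes would have to preserve or change.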


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      The start of some sanity?

        For what it's worth, I concur.

        - tye        

        The fix has no effect on:

        my $ref; { my $string = "123"; $ref = \substr $string, 1, 1; } ## This was the (IMO false) "bug" scenario.

        The fix is for:

        { my $string = "123"; my $ref = \substr $string, 1, 1; }

        All the patch does (or is supposed to do) is avoid saving the SV inside of the OP, something that causes $string to live until the end of the program in the second snippet.

        Do you have a way of automating the test so I can use git bisect to find when the behaviour changed (if it did)?
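One possible shape for such a test, as a sketch only. It assumes a Linux host where /proc/self/status exposes VmRSS, which is not the Windows setup used in this thread, and the helper names rss_kb and substr_ref_growth_kb are made up for illustration. It measures resident memory before and after taking the substr ref and reports the difference.

```perl
use strict;
use warnings;

# Resident set size of this process in kB (Linux-only assumption).
sub rss_kb {
    open my $fh, '<', '/proc/self/status'
        or die "cannot read /proc/self/status: $!";
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    die "VmRSS not found";
}

# Build a string of $size bytes, take an lvalue substr ref to its second
# half, and return how much resident memory grew (in kB) while doing so.
sub substr_ref_growth_kb {
    my ($size) = @_;
    my $x = chr(0);
    $x x= $size;
    my $before = rss_kb();
    my $r      = \substr($x, $size / 2);
    my $after  = rss_kb();
    return $after - $before;
}

my $growth = substr_ref_growth_kb(100e6);
print "growth: ${growth} kB\n";
```

Under git bisect run, the script could end with something like exit($growth > 10_000 ? 1 : 0), since an exit status of 0 marks a commit as good and 1 marks it as bad. The 10 MB threshold is an arbitrary choice.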

        The patches for RT#67838 are found in:

        5.13.4+ 5.14.0+

        I ran your test program with ActivePerl 5.14.0 (32 bit) on Windows, and noticed NO memory increase. (Steady at 99MB.) In fact, I saw increases only in versions of Perl before that one.

        perl -E"$x=x; $x x= 100e6; <>; $r = \substr $x, 50e6; <>; ord $$r; <>"

        Perl build          1st pause   2nd pause   3rd pause
        AP 5.12.4 32-bit    99M         148M        148M
        AP 5.14.0 32-bit    99M         99M         148M
        AP 5.14.2 32-bit    99M         99M         148M

        There will be a memory increase if you read the string $$r, but that's how magic works in Perl.
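To see that effect on a small string, something along these lines could be used. It is a sketch that assumes Devel::Peek is available (it ships with core Perl) and makes no claim about the exact dump format on any particular build. Going by the measurements above, on 5.14 the first dump would be expected to show the lvalue without a PV buffer and the second, taken after the read, with one.

```perl
use strict;
use warnings;
use Devel::Peek;

my $x = chr(0);
$x x= 100;

my $r = \substr($x, 50);   # take the lvalue ref; nothing has read it yet

Dump($$r);                 # dump goes to STDERR

my $n = ord $$r;           # reading $$r fires the substr get magic

Dump($$r);                 # compare the PV line with the first dump
```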


        I looked at the source code at the time of the patch. There is no intentional "prefetching", and testing shows no accidental prefetching:

        >\progs\perl5142-ap1402\bin\perl -MDevel::Peek -E"$x=chr(0); $x x= 100; Dump substr $x, 50;"
        SV = PVLV(0x4de51c) at 0x497b04
          REFCNT = 1
          FLAGS = (TEMP,GMG,SMG)
          IV = 0
          NV = 0
          PV = 0                                    <----- No buffer
          MAGIC = 0x4cc694
            MG_VIRTUAL = &PL_vtbl_substr
            MG_TYPE = PERL_MAGIC_substr(x)
          TYPE = x
          TARGOFF = 50
          TARGLEN = 50
          TARG = 0x4a932c
          SV = PV(0x25603c) at 0x4a932c
            REFCNT = 2
            FLAGS = (POK,pPOK)
            PV = 0x4b36cc "\0\0\0...\0\0\0"\0
            CUR = 100
            LEN = 104

        In contrast with a version that does grow as soon as substr is called:

        >\progs\perl5124-ap1205\bin\perl -MDevel::Peek -E"$x=chr(0); $x x= 100; Dump substr $x, 50;"
        SV = PVLV(0x319e4c) at 0x3bf54
          REFCNT = 1
          FLAGS = (PADMY,GMG,SMG,pPOK)
          IV = 0
          NV = 0
          PV = 0x2f42dc "\0\0\0...\0\0\0"\0          <-----
          CUR = 50
          LEN = 52
          MAGIC = 0x327f94
            MG_VIRTUAL = &PL_vtbl_substr
            MG_TYPE = PERL_MAGIC_substr(x)
          TYPE = x
          TARGOFF = 50
          TARGLEN = 50
          TARG = 0x2f875c
          SV = PV(0x36034) at 0x2f875c
            REFCNT = 2
            FLAGS = (POK,pPOK)
            PV = 0x330f6c "\0\0\0...\0\0\0"\0
            CUR = 100
            LEN = 104

        $ref continues to refer (directly) to the memory allocated to $string.

        Neither $ref nor $$ref ever refers directly to $string's PV. They can't, because the address of $string's buffer can change as $string changes.
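A short illustration of that point, as a sketch with made-up values: the lvalue remembers the target scalar plus an offset and a length (the TARG, TARGOFF and TARGLEN fields in the dumps above), and looks the substring up again on each read. Replacing the whole string, which may well move its buffer, therefore does not leave $$ref pointing at stale bytes.

```perl
use strict;
use warnings;

my $string = "abcdef";
my $ref    = \substr($string, 2, 2);   # offset 2, length 2

print "$$ref\n";                       # cd

# Replace the whole string with a much longer one; its buffer may be
# reallocated at a different address.
$string = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" x 100;

print "$$ref\n";                       # CD
```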

        This would require retaining a pointer back to 'parent' string with the lvalue ref

        $$ref does have a reference to $string.

        Probably difficult to orchestrate.

        Actually, that's easy. Just weaken $$ref's reference to $string.

        At which point the extraneous (pre and/or post fix) bits of $string are GC'd leaving $ref pointing at just that which it references.

        One can't free the start of a memory block. I don't think one can even free the end of a memory block. The string would have to be copied to a new buffer in order to shrink the buffer. Doable, but it would require a temporary "doubling" of memory.

        It would also be uncharacteristic of Perl. Perl intentionally avoids freeing memory left and right. Shrinking a buffer would be a first!