in reply to Re^2: Access via substr refs 2000 times slower
in thread Access via substr refs 2000 times slower

I wonder if this change is tied to the work that was done to allow copy-on-write strings. Modifying a large string in place via a reference doesn't look like it would play well with the idea of having multiple variables use copy-on-write so they can share one actual copy of a large string.

Re^4: Access via substr refs 2000 times slower
by BrowserUk (Patriarch) on Dec 28, 2008 at 21:39 UTC

    Dunno! According to ikegami, lvalue refs have always caused the substring to be copied at least as long as I've been using Perl--circa 5.6.1. Though I'd have sworn they never used to.

    Also, despite the copying, using the ref as an lvalue still modifies the original string in-place:

        $s = 'the quick brown fox';;
        $r = \substr $s, 10, 5;;
        $$r = 'green';;
        print $s;;
        the quick green fox

    The same is true for multiple copies of the lvalue ref:

        $r2 = $r;;
        $$r2 = 'orange';;
        print $s;;
        the quick orange fox

    And from my reading of the dump of an lvalue ref, it carries all the information needed to access the substring. A reference to the original SV, and the offset & length of the substring:

        print Dump $r2;;
        SV = RV(0x186d8d0) at 0x196d088
          REFCNT = 1
          FLAGS = (ROK)
          RV = 0x196d0dc
          SV = PVLV(0x186c9f4) at 0x196d0dc
            REFCNT = 2
            FLAGS = (PADMY,GMG,SMG,pPOK)
            IV = 0
            NV = 0
            PV = 0x19be29c "orange"\0    ######## This seems to be redundant to me.
            CUR = 6
            LEN = 7
            MAGIC = 0x182a75c
              MG_VIRTUAL = &PL_vtbl_substr
              MG_TYPE = PERL_MAGIC_substr(x)
            TYPE = x
            TARGOFF = 10    ### Offset
            TARGLEN = 5     ### Length
            TARG = 0x2350bc
            SV = PV(0x2354ac) at 0x2350bc    ### The original SV
              REFCNT = 2
              FLAGS = (POK,pPOK)
              PV = 0x182a7bc "the quick orange fox"\0
              CUR = 20
              LEN = 21
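    The same three fields can also be read from Perl code rather than from a dump. This is a minimal sketch, assuming the core B module and its B::PVLV accessors (TYPE, TARGOFF, TARGLEN, TARG). The variable names are illustrative, and the exact values reported may differ between Perl versions.

```perl
use strict;
use warnings;
use B ();

my $s = 'the quick brown fox';
my $r = \substr $s, 10, 5;          # reference to the substr lvalue (a PVLV)

# Wrap the referenced PVLV in a B object so its internal fields can be read.
my $lv = B::svref_2object($r);

printf "class   = %s\n", ref $lv;
printf "type    = %s\n", $lv->TYPE;       # magic type character
printf "targoff = %d\n", $lv->TARGOFF;    # offset into the target string
printf "targlen = %d\n", $lv->TARGLEN;    # length of the substring
printf "target  = %s\n", $lv->TARG->PV;   # string held by the original SV
```

    Nothing here touches the PV of the lvalue itself. The offset, the length, and the pointer to the target are enough to locate the substring.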

    So making a copy of the substring seems redundant and profligate, as well as dashed inconvenient for my purposes.

    I'm also very skeptical of there being any real benefits to the Holy COW for Perl anyway.

    It is far more efficient to request a single block of pages from the OS and then REP MOVSD dst, src to duplicate it en masse, than to request it page by page and copy it piecemeal every time a reference count gets changed; or a string is used in a numeric context; or a number is interpolated into a string; or any of the myriad other 'read-only' touches to memory that would necessitate COW being invoked.

    One expensive kernel call and one relatively fast user-space operation, versus dozens, hundreds or thousands of expensive ring3-ring0-ring3 transitions, not to mention the cost of the cache flushes.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^4: Access via substr refs 2000 times slower
by ikegami (Patriarch) on Dec 29, 2008 at 00:29 UTC

    It's not related to COW. For example, let's look at the following snippet from (the arbitrarily chosen) pp_concat function ("." operator):

        else { /* TARG == left */
            STRLEN llen;
            SvGETMAGIC(left);    /* or mg_get(left) may happen here */
            if (!SvOK(TARG)) {
                if (left == right && ckWARN(WARN_UNINITIALIZED))
                    report_uninit(right);
                sv_setpvn(left, "", 0);
            }
            (void)SvPV_nomg_const(left, llen);    /* Needed to set UTF8 flag */
            lbyte = !DO_UTF8(left);
            if (IN_BYTES)
                SvUTF8_off(TARG);
        }

    SvGETMAGIC(left) is what calls the 'get' magic and stores the result in the SV if the LHS is magical. If it didn't work that way, the following sv_setpvn, SvPV_nomg_const and DO_UTF8 would all have to be handled by the magic. It doesn't make sense to have every type of magic and every tied variable handle every perlapi and internal Perl function that might affect it.
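    The effect of 'get' magic can be observed from Perl code with a tied scalar, whose FETCH method is invoked through that same magic mechanism. This is a minimal sketch, not taken from the Perl source. CountingScalar is a hypothetical class made up for illustration, and the number of FETCH calls per operation can vary between Perl versions, so only "at least once" is assumed.

```perl
use strict;
use warnings;

# Hypothetical tie class, for illustration only: counts FETCH calls.
package CountingScalar;

sub TIESCALAR {
    my ($class, $value) = @_;
    return bless { value => $value, fetches => 0 }, $class;
}

sub FETCH {
    my ($self) = @_;
    $self->{fetches}++;          # record that get magic reached us
    return $self->{value};
}

sub STORE {
    my ($self, $value) = @_;
    $self->{value} = $value;
}

package main;

my $obj = tie my $t, 'CountingScalar', 'abc';

# Concatenation invokes get magic on the tied operand, which calls FETCH.
# The operator then works on the fetched value as an ordinary string.
my $joined = $t . 'def';

print "$joined\n";
printf "FETCH was called %d time(s)\n", $obj->{fetches};
```

    The tie class only has to supply FETCH and STORE. The concatenation logic itself never needs to know the operand was magical.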

    You could shorten the life of the copy to where it's needed, but Perl doesn't even do that for lexical variables. Their PV remains allocated when the variable is out of scope.

        >perl -MDevel::Peek -e"sub f { my $s; Dump $s if $i++; $s='abc' } f;f"
        SV = PV(0x238e44) at 0x182ee40
          REFCNT = 1
          FLAGS = (PADBUSY,PADMY)
          PV = 0x18250bc "abc"\0
          CUR = 3
          LEN = 4