Dunno! According to ikegami, lvalue refs have always caused the substring to be copied, at least for as long as I've been using Perl (circa 5.6.1), though I'd have sworn they never used to.
Also, despite the copying, using the ref as an lvalue still modifies the original string in-place:
    $s = 'the quick brown fox';;
    $r = \substr $s, 10, 5;;
    $$r = 'green';;
    print $s;;
    the quick green fox
The same is true for multiple copies of the lvalue ref:
    $r2 = $r;;
    $$r2 = 'orange';;
    print $s;;
    the quick orange fox
And from my reading of the dump of an lvalue ref, it carries all the information needed to access the substring: a reference to the original SV, plus the offset and length of the substring:
    print Dump $r2;;
    SV = RV(0x186d8d0) at 0x196d088
      REFCNT = 1
      FLAGS = (ROK)
      RV = 0x196d0dc
      SV = PVLV(0x186c9f4) at 0x196d0dc
        REFCNT = 2
        FLAGS = (PADMY,GMG,SMG,pPOK)
        IV = 0
        NV = 0
        PV = 0x19be29c "orange"\0    ######## This seems to be redundant to me.
        CUR = 6
        LEN = 7
        MAGIC = 0x182a75c
          MG_VIRTUAL = &PL_vtbl_substr
          MG_TYPE = PERL_MAGIC_substr(x)
        TYPE = x
        TARGOFF = 10    ### Offset
        TARGLEN = 5     ### Length
        TARG = 0x2350bc
        SV = PV(0x2354ac) at 0x2350bc    ### The original SV
          REFCNT = 2
          FLAGS = (POK,pPOK)
          PV = 0x182a7bc "the quick orange fox"\0
          CUR = 20
          LEN = 21
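For anyone wanting to reproduce the session above outside my REPL, here is a minimal standalone sketch. It assumes only the core Devel::Peek module; the variable names match the snippets above. The addresses, and possibly some of the flags and TARGLEN, will differ on your build and Perl version, so treat the dump as illustrative rather than something to diff against.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

my $s = 'the quick brown fox';

# Take a reference to the substr lvalue (a PVLV carrying substr magic).
my $r = \substr( $s, 10, 5 );

# Assigning through the ref writes into the original string.
$$r = 'green';
print "$s\n";    # the quick green fox

# A copy of the reference points at the same PVLV.
my $r2 = $r;
$$r2 = 'orange';
print "$s\n";    # the quick orange fox

# Dump writes to STDERR; look for TARGOFF, TARGLEN and TARG in the PVLV.
Dump($r2);
```

The two print statements show the in-place modification; the dump is where to check whether your perl also keeps a private PV copy of the substring alongside the TARG pointer.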
So making a copy of the substring seems redundant and profligate, as well as dashed inconvenient for my purposes.
I'm also very skeptical of there being any real benefits to the Holy COW for Perl anyway.
It is far more efficient to request a single block of pages from the OS and then REP MOVSD dst, src to duplicate it en masse, than to request it page by page and copy it piecemeal every time a reference count gets changed; or a string is used in a numeric context; or a number is interpolated into a string; or any of the myriad other 'read-only' touches to memory that would necessitate COW being invoked.
One expensive kernel call and one relatively fast user-space operation, versus dozens, hundreds or thousands of expensive ring3-ring0-ring3 transitions, not to mention the cost of the cache flushes.