That's an interesting thought, but it doesn't appear to be the case.

    # 3332k
    $s = ' ' x 1_000_000;        # 4316k
    $s = substr $s, 0, 999_000;  # 4316k
    $s .= '?' x 2000;            # 4316k

Had that been the case, I would have expected to see memory growth when I appended to the copied scalar, but this doesn't happen. (On win32 anyway.)

Conversely, if it were a copy-on-write phenomenon, then assigning the truncated substring to another scalar would likewise defer the copy until the new scalar was modified, which doesn't happen.

    # 3336k
    $s = ' ' x 1_000_000;        # 4320k
    $t = substr $s, 0, 999_999;  # 5308k
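For anyone who wants to repeat both experiments without watching the process size in the task manager, here is a rough sketch that reads each scalar's used length (CUR) and allocated buffer size (LEN) through the core B module. The buf_info and report helpers are my own names for illustration, not anything from the perl sources, and the LEN figures you see will vary with the perl version and build, so treat the output as indicative only.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper (not from the perl sources): return the used
# length (CUR) and allocated buffer size (LEN) of a string scalar.
sub buf_info {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ( $sv->CUR, $sv->LEN );
}

# Hypothetical helper: print one labelled line of CUR/LEN figures.
sub report {
    my ( $label, $ref ) = @_;
    printf "%-18s CUR=%8d LEN=%8d\n", $label, buf_info($ref);
}

my $s = ' ' x 1_000_000;
report( 'initial $s', \$s );

$s = substr $s, 0, 999_000;       # truncate, assigning back to the source
report( 'after truncation', \$s );

$s .= '?' x 2000;                 # append to the truncated scalar
report( 'after append', \$s );

my $t = substr $s, 0, 999_999;    # assign a substring to a different scalar
report( 'copy in $t', \$t );
```

If LEN for $s stays put across the truncation and the append, no reallocation took place; $t should always report a buffer of its own.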

Tracing through the sources, I can't see any explicit step taken in pp_substr or sv_setpvn to avoid copying when the source and target are the same. However, the address of the target is known to the code at this point, and a call is made to SvGROW to ensure that the target (in this case the same as the source) is large enough; it is here that any extra memory allocation would be performed. Because the target SV is the same as the source, and the "growth" required is actually shrinkage, no allocation is necessary.

The actual copy of the data is (eventually) performed using the C-library call memmove().

This is the memcpy() look-alike that has the extra nous to deal with overlapping copies. In the case of a simple truncation, its logic (which I don't have access to, but can guess at) probably results in simply copying a single null byte to the insertion point.

What actually happens also depends upon the C runtime used, but this is an obvious optimisation that probably exists in all versions of memmove().


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!


In reply to Re: Re: Re: Efficienty truncating a long string by BrowserUk
in thread Efficienty truncating a long string by dino
