One common cause of this in Perl is reading your shared numeric data in from a file.

When you assign that data to the scalars in the array, the data is still stored as strings: '123'.

But then you come to use those values as a part of a numeric expression, and perl converts the string stored in the PV slot of the scalar and assigns the binary numeric value to the IV or NV slot.

Bang! You just made a (seemingly) non-mutating reference to a single shared value and caused a 4k page to be copied. Iterate your entire array summing the numbers and you'll cause the whole array, plus everything else on each page that contains any of your array's scalars, to be copied as well. A prime example of halo slippage on the "threads are spelt f-o-r-k" holy COW.
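If you want to watch that mutation happen, here is a minimal sketch that uses the core B module to report which value slots a scalar holds before and after it is summed. The helper name slots() and the sample data are my own, purely for illustration; $lineOfData stands in for a line you read from your file.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report which value slots of a scalar are currently valid.
sub slots {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @valid;
    push @valid, 'PV' if $flags & B::SVp_POK();
    push @valid, 'IV' if $flags & B::SVp_IOK();
    push @valid, 'NV' if $flags & B::SVp_NOK();
    return join '+', @valid;
}

my $lineOfData = '123 456 789';           # stands in for a line read from a file
my @fromFile   = split ' ', $lineOfData;  # elements are stored as strings

my $before = slots(\$fromFile[0]);
my $sum = 0;
$sum += $_ for @fromFile;                 # a "read-only" pass over the array
my $after = slots(\$fromFile[0]);

print "before the sum: $before\n";
print "after the sum:  $after\n";
```

The element should start out with only its PV slot valid and pick up a numeric slot once it has been used in the sum; that change to the scalar is the write that dirties the page.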

A practical tip: If your shared arrays are numeric and read from files or a DB, add zero to them as you assign them:

my @sharedArray = map 0+$_, split ' ', $lineOfData;;

Not only will you not cause COW when doing math with them after forking, the array will be smaller to boot. Adding zero forces the conversion of the string you read into a binary numeric before it is assigned to the SV, which means no PV will be allocated and you save space. And as the values are already numeric, using them in a numeric context doesn't have to convert them, so there is no mutation of the SV and no COW.
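A rough way to check that, again leaning on the core B module: compare elements assigned as read with elements assigned through 0+$_, then confirm that summing the latter leaves their flags untouched. has_pv() and the sample line are hypothetical names of mine, not anything from your code.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: true if the scalar currently holds a string (PV) value.
sub has_pv {
    my ($ref) = @_;
    return B::svref_2object($ref)->FLAGS & B::SVp_POK() ? 1 : 0;
}

my $lineOfData  = '123 456 789';                       # stands in for file or DB input
my @asRead      = split ' ', $lineOfData;              # strings
my @sharedArray = map 0+$_, split ' ', $lineOfData;    # converted on assignment

my $flagsBefore = B::svref_2object(\$sharedArray[0])->FLAGS;
my $sum = 0;
$sum += $_ for @sharedArray;                           # the math you would do after forking
my $flagsAfter  = B::svref_2object(\$sharedArray[0])->FLAGS;

printf "as read has a PV:    %d\n", has_pv(\$asRead[0]);
printf "0+ version has a PV: %d\n", has_pv(\$sharedArray[0]);
printf "flags changed by the sum: %s\n", $flagsBefore == $flagsAfter ? 'no' : 'yes';
```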

Of course, that only holds true until you use them in a string context. If you need to print them out to another file, or the terminal, use printf instead of print and interpolation.

my @a = 1 .. 1e6;;                           ## takes 62 MB.
printf "the number is: %d\n", $_ for @a;;    ## causes no memory growth
print "the number is: $_\n" for @a;;         ## causes the memory to grow to 110MB.
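The same difference can be seen without measuring memory at all. This is a sketch under the assumption that your perl caches the string form inside the scalar when an integer is interpolated, which is what the perls I know of do; has_pv() is a hypothetical helper, and sprintf stands in for printf so that nothing needs to go to the terminal.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: true if the scalar currently caches a string (PV) form.
sub has_pv {
    my ($ref) = @_;
    return B::svref_2object($ref)->FLAGS & B::SVp_POK() ? 1 : 0;
}

my @a   = 1 .. 5;      # small stand-in for the million-element array
my $out = '';

# Formatting with %d reads the integer slot directly.
$out .= sprintf "the number is: %d\n", $_ for @a;
my $afterSprintf = has_pv(\$a[0]);

# Interpolation stringifies each element in place.
$out .= "the number is: $_\n" for @a;
my $afterInterpolation = has_pv(\$a[0]);

printf "PV cached after sprintf:       %d\n", $afterSprintf;
printf "PV cached after interpolation: %d\n", $afterInterpolation;
```

The cached string is both the extra memory and, on shared data, the write that triggers the copy.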

Use interpolation on shared data and the growth would be far higher because unless you are very careful in how you populate the original array, the scalars it consists of will occupy space in 4k pages shared with other data, and they'll be copied also.

Do it in 2 or more forked children and they'll all get their own copies. 100MB of shared numeric data and 5 forked children and you can see the total memory requirement blossom to well over 1GB just cos you interpolated the numbers.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re: WHY copying does happen (fork) by BrowserUk
in thread WHY copying does happen (fork) by Anonymous Monk
