On a 32-bit system there is approximately 32 bytes of overhead per string (not including the string itself). Also, if you create a list (e.g. with split) and then assign it to an array, perl may temporarily need two copies of each string (plus extra space for the large temporary stack). After the assignment the temporary copy is freed for perl to reuse, but it is not returned to the OS (so VM usage won't shrink). Given that Devel::Size itself has a large overhead, what you are seeing looks reasonable. Consider the following code:
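use strict;
use warnings;
use Encode qw(decode);
my $content = decode('UTF-8', 'tralala ' x 1E6);
my @a;
$#a = 10_000_000; # presize array
for (1..5)
{
    print "ITER $_\n";
    # the capturing group makes split return the separators as well
    push @a, split m{(\p{Z}|\p{IsSpace}|\p{P})}ms, $content;
    procinfo(); # helper (not shown) that prints the process's Vsize and RSS
}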
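The procinfo() helper is not shown in the post. A minimal sketch of what such a helper might look like follows, assuming a Linux system where /proc/self/stat is readable; the name procinfo_report and the exact output layout are illustrative guesses, not the original code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the post's procinfo(); Linux-only, because it
# reads /proc/self/stat. Returns the report as a string, or undef if the
# file is unavailable.
sub procinfo_report {
    open my $fh, '<', '/proc/self/stat' or return;
    my $stat = <$fh>;
    close $fh;
    return unless defined $stat;

    # The command name (field 2) is parenthesised and may contain spaces,
    # so drop everything up to the last ')' before splitting.
    $stat =~ s/^.*\)\s+//s;
    my @f = split ' ', $stat;    # $f[0] is field 3 (process state)

    # proc(5): field 23 is vsize in bytes, field 24 is rss in pages.
    my ($vsize, $rss) = @f[20, 21];
    return sprintf "Vsize: %.2f MiB (%d)\nRSS  : %d pages\n",
        $vsize / (1024 * 1024), $vsize, $rss;
}

sub procinfo {
    my $report = procinfo_report();
    print defined $report
        ? $report
        : "procinfo: /proc/self/stat not available\n";
}

procinfo();
```

Vsize counts address space the process has mapped, while RSS counts pages currently resident in RAM, so the two figures need not grow in step.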
On my system the loop gives the following output:
ITER 1
Vsize: 248.18 MiB ( 260235264)
RSS : 62362 pages
ITER 2
Vsize: 317.14 MiB ( 332550144)
RSS : 80000 pages
ITER 3
Vsize: 393.71 MiB ( 412839936)
RSS : 99598 pages
ITER 4
Vsize: 579.46 MiB ( 607612928)
RSS : 147156 pages
ITER 5
Vsize: 625.23 MiB ( 655597568)
RSS : 158895 pages
This averages about 94 MiB of growth per iteration. Each iteration pushes about 2 million strings onto @a (a million words plus a million captured separators), so that is roughly 49 bytes per string. Allowing 32 bytes of overhead per string (the SV and PV structures) leaves about 17 bytes per string, which, allowing for the trailing \0, rounding up to a multiple of 4, malloc overhead and so on, looks reasonable.
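The per-string estimate can be rechecked from the Vsize byte counts in the output. This is only a sketch of the arithmetic: the figure of 2 million strings per iteration is an assumption (1 million words plus 1 million captured separators from the split), and the 32-byte overhead is the 32-bit estimate given at the top of the post. Working from the exact byte counts, the result comes out at a little under 50 bytes per string, in line with the estimate above.

```perl
use strict;
use warnings;

# Vsize in bytes after each iteration, copied from the output above.
my @vsize = (260_235_264, 332_550_144, 412_839_936, 607_612_928, 655_597_568);

# Assumption: each iteration pushes about 2 million strings, because the
# capturing split returns 1E6 words plus 1E6 separators.
my $strings_per_iter = 2_000_000;

# Assumption: 32 bytes of SV/PV overhead per string on a 32-bit perl.
my $sv_overhead = 32;

# Average growth over the four steps between the five measurements.
my $growth     = ($vsize[-1] - $vsize[0]) / (@vsize - 1);
my $per_string = $growth / $strings_per_iter;

printf "average growth: %.1f MiB per iteration\n", $growth / (1024 * 1024);
printf "per string:     %.1f bytes\n", $per_string;
printf "after overhead: %.1f bytes\n", $per_string - $sv_overhead;
```

The growth per iteration is uneven (the jump from iteration 3 to 4 is much larger than the others), so only the average is meaningful here.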
Dave.