in reply to Perl Garbage Collection, again

It is possible to demonstrate (as you have) that if you request a single large chunk of RAM from the C-runtime allocator (malloc() etc.), it will make a separate virtual memory allocation (VirtualAlloc() or the platform equivalent) specifically for that request, rather than use or extend the existing heap space.

And when the CRT free() is called on that allocation, it will in turn call the OS VirtualFree() (or equivalent), which returns the virtual memory pages backing the allocation to the OS.

As you can see, some RAM is given back to the OS (~50%: 99364/197020).

If you consider those numbers carefully, you have 197020*1024 = 201,748,480 bytes of virtual memory before the undef and 99364*1024 = 101,748,736 bytes afterward. That means you freed 99,999,744 bytes of space, which corresponds (give or take rounding to whole 1024-byte units) to the single, extraordinary 100,000,000-byte allocation you made.
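For the record, you can check that arithmetic directly against the two top figures (top reports sizes in KB):

    perl -e 'printf "%d bytes freed\n", ( 197020 - 99364 ) * 1024'
    # prints: 99999744 bytes freed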

Whilst you can make use of this knowledge to cause single huge chunks of RAM to be returned to the OS (the break point seems to be ~1/2 MB or greater using the MS CRT (via AS Perl) on my system), it isn't as useful as you might think.
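If you want to probe that break point on your own system, a sketch along these lines will do it (the sizes to try are my guesses, and whether the memory actually comes back will vary with CRT and platform):

    $| = 1;
    print "$$\n";    # watch this pid in top (or Task Manager)
    for my $kb ( 256, 512, 1024, 2048 ) {
        {
            my $s = 'X';
            $s x= $kb * 1024;    # grow in place to $kb KB in one allocation
            print "$kb KB allocated: press enter\n"; <>;
        }
        # $s is out of scope here; check whether the process size dropped
        print "$kb KB freed: press enter\n"; <>;
    }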

The majority of large volumes of data manipulated in Perl are allocated in arrays or hashes. And whilst these both have a single, largish, contiguous component at their core (the array of pointers), the vast majority of the space they occupy when populated is allocated as lots of smallish allocations (the SVs, HEs, and HEKs). These will always be allocated from separate virtual allocations (heaps) and intermingled with other SVs etc. And those virtual allocations will never be released back to the OS until every allocation within them has been freed. Which is unlikely to ever happen.
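You can watch that happen with a sketch like this (my example; the exact figures will vary with platform and perl build):

    $| = 1;
    print "$$\n";    # watch this pid in top
    print "before array: press enter\n"; <>;
    {
        # a million smallish SVs, drawn from perl's shared heaps
        my @a = map { "padding-$_" } 1 .. 1_000_000;
        print "array populated: press enter\n"; <>;
    }
    # the SVs go back to perl's internal pools, but the heap pages they
    # came from are intermingled with other allocations, so the process
    # size (almost certainly) does not drop back
    print "array freed: press enter to exit\n"; <>;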

So, if you're manipulating large chunks of contiguous data (basically, big scalars) within your program, and you can cause them to be allocated in one go (as with your 'X' x 1e8; there are other ways also), then you can cause them to be freed back to the OS (rather than just to the process memory pool) when you are done with them.
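Two of those other ways (the first is the x= form used further down this thread; the vec() trick is my own example):

    # grow in place: a single large allocation, no separate temporary
    my $s = 'X';
    $s x= 100_000_000;

    # pre-extend a scalar's buffer in one go by poking its last byte
    my $t = '';
    vec( $t, 100_000_000, 8 ) = 0;    # $t is now ~100MB, nul-filled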

But the conditions under which that free-back-to-the-OS happens are unwritten, vague, and probably vary widely with platform, CRT version, and maybe even compiler and linker options. It's not something that you can easily codify well enough to rely upon.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: Perl Garbage Collection, again
by pkirsch (Novice) on Dec 31, 2008 at 12:11 UTC
    Helpful, indeed.
    What I recognized:
    - allocating a big chunk of memory at once is evil (when it comes to releasing it afterwards):
    use strict;
    $| = 1;
    print "$$\n";    #top -p $$
    print "Test, Allocating a large string \n"; <>;
    {
        my $foo = 'X' x 100000000;
        print "Large String allocated.\n"; <>;
        undef $foo;
        print "Large String deallocated.\n"; <>;
    }
    print "2nd Large String.\n"; <>;
    {
        #evil: my $foo2 = 'X' x 100000000;
        my $foo2;
        $foo2 .= 'x' x 1000 for (1 .. 100000);
        print "2nd Large String allocated.\n"; <>;
        undef $foo2;
        print "2nd Large String deallocated.\n"; <>;
    }
    print "Now what? Press enter to exit"; <>;
    As you can see the memory used for $foo2 is returned to the OS. So my initial example was not correct.

    May I ask you another question, which puzzles me:
    The above example also shows that, at the end of the script, there are still 97m allocated (due to $foo). Also, the variable $foo2 does not reuse the chunks previously allocated for $foo (the total usage rises to 192m).
    Is this caused by the overhead of perl's internal memory handling?
    Thanks,

      Sorry for the delay. I got into something else and overlooked this. Try this version of your code:

      use strict;
      $| = 1;
      print "$$\n";    #top -p $$
      print "Test, Allocating a large string \n"; <>;
      {
          my $foo = 'X';
          $foo x= 100000000;
          print "Large String allocated.\n"; <>;
          undef $foo;
          print "Large String deallocated.\n"; <>;
      }
      print "2nd Large String.\n"; <>;
      {
          #evil: my $foo2 = 'X' x 100000000;
          my $foo2;
          $foo2 .= 'x' x 100_000 for (1 .. 1000);
          print "2nd Large String allocated.\n"; <>;
          undef $foo2;
          print "2nd Large String deallocated.\n"; <>;
      }
      print "Now what? Press enter to exit"; <>;

      When that reaches the "Now what" prompt, you should find that the memory usage has returned to the same level as at startup, with all the large allocations now returned to the OS.

      The main change is my $foo = 'X'; $foo x= 100_000_000; rather than my $foo = 'X' x 100_000_000;.

      With the latter version, the big string is constructed first as a separate value, and then assigned (copied) to the scalar $foo, with the result that it makes two large memory allocations, one of which never gets cleaned up.

      With the former version, that double allocation is avoided and the memory is cleaned up properly.
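      A stripped-down comparison, if you just want to watch this one effect in top (my sketch, not part of the code above):

      $| = 1;
      print "$$\n";
      print "startup: press enter\n"; <>;
      {
          my $foo = 'X' x 100_000_000;    # separate 100MB value built, then copied
          print "copy form allocated: press enter\n"; <>;
      }
      print "copy form freed: press enter\n"; <>;    # one 100MB chunk remains
      {
          my $bar = 'X';
          $bar x= 100_000_000;            # grown in place: a single 100MB buffer
          print "in-place form allocated: press enter\n"; <>;
      }
      print "in-place form freed: press enter to exit\n"; <>;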

      Note. The minor change to the second loop -- 1_000 iterations of 100_000 characters rather than 100_000 iterations of 1_000 characters -- makes no difference to the outcome; I just got bored waiting for the loop to run :)

      The duplicated allocation with the second ('X' x 100_000_000) form isn't an error, but rather a side effect of the way the code is parsed and executed. The fact that it doesn't get cleaned up properly could be construed as a bug--or not. You'd have to raise the issue with p5p and take their view on the matter.

