richardwfrancis has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

I wish to be enlightened by your collective wisdom once more, please.

I've tried to simplify the problematic area of my code below.

The trouble I am experiencing is that between the loop iterations, even though I am resetting my hashref, I don't seem to be freeing up the memory it occupied.

If you run the code while watching "top" in Linux or some equivalent, you'll see that after the first loop, even though the hashref is reset near the end of the loop body, there's still a fair chunk of memory in use.

I've no idea what I'm doing wrong here. Please can someone tell me?

I've commented the code to show you what it is doing. Note that it will take about 25 seconds to run (because of the sleeps) and use up about 4 GB of RAM; if you need it to use less of both, adjust the sleeps and decrease the values in %lengths.

Many thanks in advance for any advice.
Kind regards,
Rich

#!/usr/bin/perl
use strict;

use Devel::Size qw(total_size);

# creating a reference to an anonymous hash
my $coverage = {};

# I'm going to run a loop twice with these values
my @times = qw(once twice);

# some data to use to fill an array in the sub
my %lengths = (
    "id1" => 50000000,
    "id2" => 50000000,
);

# keeping track of the estimated memory size of the coverage hashref
my $total_size = total_size($coverage);
print "size before the loops = $total_size\n";

# OK, run the sub twice with different parameters
foreach my $t (@times){
    if ($t eq "once"){
        # I'm doing stuff here that's conditional on $t
        # then running the sub
        doWork($t);
    }
    elsif ($t eq "twice"){
        # I'm doing stuff here that's conditional on $t
        # then running the sub
        doWork($t);
    }

    # keeping track of the estimated memory size of the coverage hashref
    $total_size = total_size($coverage);
    print "size after $t sub = $total_size\n";

    # reset the hashref so that memory should be small again
    %{ $coverage } = ();

    # keeping track of the estimated memory size of the coverage hashref
    $total_size = total_size($coverage);
    print "size after resetting the hashref = $total_size\n";

    # while total_size reports that the hashref is small, top shows that
    # the program is still hogging memory
    # what am I doing wrong????
    print "what's the mem doing?\n";
    sleep(10);
}

sub doWork {
    my $time = shift;

    # keeping track of the estimated memory size of the coverage hashref
    my $total_size = total_size($coverage);
    print "size at the start of $time = $total_size\n";

    # the main work in the sub is to fill a large referenced array
    foreach my $l (keys %lengths){
        print "creating $l\n";
        push @{ $coverage->{$l} }, 0 for (1 .. $lengths{$l});
    }

    # keeping track of the estimated memory size of the coverage hashref
    $total_size = total_size($coverage);
    print "size after cov made in $time = $total_size\n";
    sleep(5);

    # then the sub goes off and does some work that gives back some
    # coordinates to use to increment particular values in the array
    # created above

    # get some data to work with
    my %increments = (
        "id1" => [qw(400-500 6000-6500 100000-100500)],
        "id2" => [qw(400-500 6000-6500 100000-100500)],
    );

    # then the values in the array are incremented and passed back to the
    # main program to do something with it
    print "incrementing\n";
    foreach my $i (keys %increments){
        foreach my $p (@{ $increments{$i} }){
            my ($s, $e) = split("-", $p);
            $coverage->{$i}->[$_]++ for ($s - 1 .. $e - 1);
        }
    }

    # keeping track of the estimated memory size of the coverage hashref
    $total_size = total_size($coverage);
    print "size after increments made in $time = $total_size\n";
    sleep(5);
}

Re: Unexplained memory hogging
by RichardK (Parson) on Aug 22, 2014 at 17:12 UTC

    Perl has its own memory manager that runs on top of the operating system's, and you cannot guess when (or even if!) perl will release memory back to the OS. It depends on many things, like the OS and even the build options selected when perl was compiled. See Mini-Tutorial: Perl's Memory Management for some details.
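
    To see this for yourself, here is a minimal sketch (Linux-only, since it reads VmRSS from /proc/self/status; the sizes are illustrative) that builds a large array, frees it, and prints the resident size at each step. The RSS typically stays high after the free:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # resident set size of this process in kB, read from /proc (Linux only)
        sub rss_kb {
            open my $fh, '<', '/proc/self/status' or return 'n/a';
            while (<$fh>) { return $1 if /^VmRSS:\s+(\d+)/ }
            return 'n/a';
        }

        print "RSS before allocating: ", rss_kb(), " kB\n";
        my @big = (0) x 10_000_000;   # roughly a few hundred MB on a 64-bit perl
        print "RSS after allocating:  ", rss_kb(), " kB\n";
        @big = ();                    # "free" the array
        print "RSS after freeing:     ", rss_kb(), " kB\n";   # usually still high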

Re: Unexplained memory hogging
by Laurent_R (Canon) on Aug 22, 2014 at 17:45 UTC
    Generally (but not always; this is OS dependent, and it also depends on compile options and other factors), even if some large chunk of memory is "freed" when a large data structure is reset or goes out of scope, that memory will not be returned to the OS, but it will be available internally to your Perl program for another large data structure.
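
    A quick way to convince yourself of this (again a Linux-only sketch reading VmRSS from /proc/self/status) is to clear one large structure and then build a second one of similar size; the second one typically reuses the freed space, so the process size barely grows:

        #!/usr/bin/perl
        use strict;
        use warnings;

        sub rss_kb {   # resident size in kB (Linux only)
            open my $fh, '<', '/proc/self/status' or return 'n/a';
            while (<$fh>) { return $1 if /^VmRSS:\s+(\d+)/ }
            return 'n/a';
        }

        my %first;
        $first{$_} = [ (0) x 1_000 ] for 1 .. 5_000;
        print "RSS with the first structure:  ", rss_kb(), " kB\n";
        %first = ();   # freed internally, but usually not back to the OS
        print "RSS after clearing it:         ", rss_kb(), " kB\n";
        my %second;
        $second{$_} = [ (0) x 1_000 ] for 1 .. 5_000;
        print "RSS with the second structure: ", rss_kb(), " kB\n";   # about the same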
Re: Unexplained memory hogging
by zentara (Cardinal) on Aug 22, 2014 at 18:12 UTC
    In addition to what has been said above, memory usage will stay at whatever peak level your program has reached. There are a couple of tricks you can use to minimize the growth, like reusing variable names so as not to incur more memory allocation. Generally, if you expect memory growth problems, design your program to fork off the heavy work; when the fork finishes, all of its memory is returned to the system.
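
    Here is a minimal sketch of that trick using a plain fork (no modules assumed): the child builds the big structure and exits, and all of its memory goes back to the system with it:

        #!/usr/bin/perl
        use strict;
        use warnings;

        foreach my $t (qw(once twice)) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {                  # child: do the heavy work
                my @big = (0) x 10_000_000;   # large temporary structure
                # ... process @big, write any results to a file or pipe ...
                exit 0;                       # child exits; its memory returns to the OS
            }
            waitpid($pid, 0);                 # parent: wait before starting the next pass
        }

    The one catch is that the child cannot hand a Perl data structure straight back to the parent; results have to come back through a file, a pipe, or something like Storable.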

    I don't quite understand why Perl can't incorporate some malloc- and free-style methods to allow forcing memory release within a program. Surely it could be done. It's so easy in C.


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
      99.9% of the time, I am very, very happy that Perl is managing memory for me, letting arrays or hashes grow transparently without my having to take care of it.

      I certainly do not regret the time when I had to do mallocs, callocs, reallocs, frees, or memsets myself every time I wanted dynamic memory allocation. Well, to tell the truth, I still use C once in a while, and that reminds me how happy I am to be using Perl instead of C most of the time. No problems with null or dangling pointers, no memory leaks (except for special cases such as circular or reciprocal references), no core dumps or segmentation faults (or almost never), no out-of-bounds arrays, and so on and so forth. Gosh, Perl is so much nicer than C.

      No, I really disagree with you. If Perl were to introduce malloc and its siblings, I would certainly go back to other dynamic languages I used before Perl (TCL, Python) or straight to newer ones such as Ruby and others.

      Besides, I don't remember for sure, and I haven't tried recently, but I am not really sure that a free in a C program actually returns the freed memory to the OS. I would think there are some OSes where that is the case, but probably not the majority of them. I may be wrong on this last point; I never tested it extensively, and I did not have any serious data-size problems back when I was using C intensively.

        but I am not really sure that a free in a C program actually returns the freed memory to the OS.

        I stand corrected. In C too, there is no guarantee that freed memory will be returned to the system and make the program smaller.

        From the GNU libc documentation for free:

        "Occasionally, free can actually return memory to the operating system and make the process smaller. Usually, all it can do is allow a later call to malloc to reuse the space. In the meantime, the space remains in your program as part of a free-list used internally by malloc."

        So it seems that it is not just an interpreted language problem.


        The releases you mention in those links are very specific and generally cannot be counted on when writing a Perl script. The key word in those links is "sometimes", which is about as reliable as a weather forecast. :-)


      Thank you everyone for your replies and comments. It has been an interesting discussion.

      I actually used zentara's suggestion of forking the large-memory section so that once the forked process ended, the memory was returned. This works well, and since I was already using Parallel::ForkManager elsewhere in the code, I only had to make a couple of changes, so thank you!

      For those interested, in addition to following Anonymous Monk's advice and passing the $coverage hashref as an argument to the sub, the other changes I made to get this working as I wanted were as follows:

      • Initially, add Parallel::ForkManager and configure a maximum of one fork:

            use Parallel::ForkManager;
            my $fm = Parallel::ForkManager->new(1);

      • Inside the foreach loop, mark the code to be run within the fork and initiate the fork:

            foreach my $t (@times){
                $fm->start and next;

      • At the end of the loop, mark the end of the code within the fork and finish the fork; outside the loop, block the main process until the forked process has finished (the whole loop is sketched below):

                print "what's the mem doing?\n";
                sleep(10);
                $fm->finish();
            }
            $fm->wait_all_children();
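
      Putting those fragments together, the loop ends up looking roughly like this (a sketch only; @times, $coverage, and doWork are from the original code above, and it assumes doWork was changed to take the hashref as a second argument per Anonymous Monk's advice):

            use Parallel::ForkManager;
            my $fm = Parallel::ForkManager->new(1);   # at most one child at a time

            foreach my $t (@times){
                $fm->start and next;       # parent skips ahead; the child runs the loop body
                doWork($t, $coverage);     # the heavy work happens in the child
                print "what's the mem doing?\n";
                sleep(10);
                $fm->finish();             # child exits here; its memory goes back to the OS
            }
            $fm->wait_all_children();      # parent blocks until the forked child is done

      Note that each child gets its own copy of $coverage, so anything the parent needs back has to be written out to a file or returned through Parallel::ForkManager's run_on_finish data-passing mechanism.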

      That's it. The forked process releases the memory when it finishes and the next process starts afresh.

      Cheers,
      Rich

Re: Unexplained memory hogging
by Anonymous Monk on Aug 22, 2014 at 23:34 UTC