in reply to Memory usage breakup

Perl doesn't return memory that is freed after a variable/object etc. is destroyed.

It does if you undef the variable. If you have a long-running program that sucks large amounts of data into a scalar at any point, it's a good idea to undef the scalar when you've finished with it (at least it is if you are having memory problems).
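
For example (a minimal sketch, not from the original post; the filename is a placeholder):

    use strict;
    use warnings;

    open my $fh, '<', 'big.log' or die "open: $!";   # placeholder filename
    my $data = do { local $/; <$fh> };               # slurp the whole file
    close $fh;

    # ... work with $data ...

    undef $data;   # release the string's storage now, not at program exit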

Re: Re: Memory usage breakup
by sgifford (Prior) on May 01, 2004 at 04:15 UTC
    Interestingly, my copy of Perl seems to return about half of the memory when you undef a variable, at least on Linux:
    sub showmem {
        system("cat /proc/$$/status | grep '^Vm' | sed -e 's/\$/ ($_[0])/'");
        print "\n";
    }
    showmem("start");
    my $x = "A" x 10_000_000;
    showmem("allocated");
    undef $x;
    showmem("freed");
    produces:
    VmSize:	    2808 kB (start)
    VmLck:	       0 kB (start)
    VmRSS:	    1144 kB (start)
    VmData:	     240 kB (start)
    VmStk:	      28 kB (start)
    VmExe:	     684 kB (start)
    VmLib:	    1528 kB (start)
    
    VmSize:	   22348 kB (allocated)
    VmLck:	       0 kB (allocated)
    VmRSS:	   20700 kB (allocated)
    VmData:	   19780 kB (allocated)
    VmStk:	      28 kB (allocated)
    VmExe:	     684 kB (allocated)
    VmLib:	    1528 kB (allocated)
    
    VmSize:	   12580 kB (freed)
    VmLck:	       0 kB (freed)
    VmRSS:	   10936 kB (freed)
    VmData:	   10012 kB (freed)
    VmStk:	      28 kB (freed)
    VmExe:	     684 kB (freed)
    VmLib:	    1528 kB (freed)
    

      Tying down how much memory a given program will use, and what, if any, of that memory will be recycled, either internally by perl or back to the OS, is extremely complicated. (As well as being highly dependent on the OS, the perl -V configuration, and the individual perl build.)

      For example, consider these two one-liners.

      P:\test>perl -e" { for( 1 .. 100_000 ) { $x[ $_ ] = ' ' x 1000; $x[ $_ ] = undef; } <STDIN>; } <STDIN>;"

      In this first example, each element of the 100_000-element global array @x is assigned a 1000-byte value, which is then immediately 'freed' by undefing it. At the end of the loop (the first prompt), 100+MB is allocated to the process: the space for 100_000 elements of 1000 bytes each, plus the overhead of perl's array and scalar structures, even though only one element of the array holds a usable value at any given time.

      P:\test>perl -e" { my @x; for( 1 .. 100_000 ) { $x[ $_ ] = ' ' x 1000; $x[ $_ ] = undef; } <STDIN>; } <STDIN>;"

      The same program, except that the array is now lexically scoped. When the first prompt is reached after the loop completes, 100+MB is again in use, meaning that 99_999 elements' worth of discarded (undef'd) space is lying around, unusable and unused. However, once the second prompt is reached, i.e. after the scope in which @x was declared has exited, the memory used by the process (on my system) drops to 12MB.

      With care and motivation it is possible to force perl to re-use discarded memory (and even to return some of it to the OS under Win32), but every attempt I've made to formulate a general strategy for doing either has fallen on stony ground. I can do it on a case-by-case basis for many applications, and I have begun to recognise cases where I am reasonably sure I can optimise the memory requirements through fairly simple steps, but inevitably there are exceptions to the rules of thumb I use.

      Unfortunately, the exceptions are too common to make the rules of thumb viable for anything other than cases of extreme need.
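
      One pattern that often (though, per the caveats above, not always) encourages reuse: keep reading into the same scalar rather than creating a fresh one each time around a loop, since read() recycles the buffer's existing allocation. A hedged sketch ('big.dat' is a placeholder):

      use strict;
      use warnings;

      open my $fh, '<', 'big.dat' or die "open: $!";
      my $buf;
      while ( read( $fh, $buf, 65_536 ) ) {
          # process 64KB at a time; $buf's storage is reused on
          # each read, so the process needn't keep growing
      }
      close $fh;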


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
        $var = undef and undef $var behave differently with respect to memory deallocation. The first never seems to free the string's storage, while the second does so instantly. BrowserUk's code uses about 136MB on my system, while the version below never grows past 4MB. (Win98/AS809)
        { for( 1 .. 100_000 ) { $x[ $_ ] = ' ' x 1000; undef($x[ $_ ]); } <STDIN>; } <STDIN>;
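
        A quick way to see the difference (a sketch using the core Devel::Peek module; the LEN field in its output reports the size of the string buffer still attached to the scalar):

        use Devel::Peek;

        my $x = ' ' x 1000;
        $x = undef;
        Dump( $x );   # expect LEN still ~1000+: storage retained

        undef $x;
        Dump( $x );   # expect PV = 0: storage actually released
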
        Interesting. In Unix, you can allocate memory using an anonymous mmap, then return it to the system with munmap. You can do that by hand if you want, packing data into the allocated region, then unpacking it back out when you need it.

        This is horribly inconvenient, of course. I wonder if an XS module could use mmap to allocate a big chunk of memory and instruct Perl's allocator to use it, then later free this memory and tell Perl to use the normal allocator. I don't know anything about Perl internals, so I shouldn't be speculating like this, but it's so much fun. :)
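
        Something along these lines, perhaps (an untested sketch assuming the Sys::Mmap module from CPAN, whose new() interface creates an anonymous private map; the POD warns that updates must go through substr() so the mapping isn't lost):

        use strict;
        use Sys::Mmap;

        my $region;
        new Sys::Mmap $region, 10_000_000 or die "mmap: $!";   # anonymous map

        substr( $region, 0, 12 ) = 'some payload';   # pack data in, in place
        my $back = substr( $region, 0, 12 );         # unpack it back out

        munmap( $region ) or die "munmap: $!";       # pages go back to the OS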

      Actually, to the best of my knowledge it goes back into the pool that perl allocates from. I'm surprised that any of it appears to be freed back to the OS.

      If you'd like to find out more about it, I remember a post on here by elian talking about the internals of this process.

      Update: as pointed out by sgifford, the repeat operator isn't getting constant-folded, so most of this is wrong - it is the pad entry for the subexpression mentioned in the last two paragraphs that is grabbing the memory.
      End update

      The 10MB string in $x is freed by the undef $x, but there is another copy of the string attached to the optree (the compiled code), since constant folding will replace

      my $x = "A" x 10_000_000;
      with
      my $x = "AAAAA..[10_000_000 long]..";

      You can test this by putting the code in an anonymous sub, and then freeing it:

      our $x;
      my $code = sub { $x = "A" x 10_000_000 };
      showmem("start");
      &$code;
      showmem("allocated");
      undef $x;
      showmem("var freed");
      undef $code;
      showmem("code freed");

      Running that here shows VMSize at each step of:

      start: 2896 kB
      allocated: 22436 kB
      var freed: 12668 kB
      code freed: 2900 kB
      

      However, I feel that it is quite rare to have big constants like this in real code, so the simplistic approach of "fold anything that's constant" is still probably the right thing for perl to do.

      Unfortunately you cannot sidestep this just by replacing the constants with variables, since then a different aspect kicks in: perl's compiler assigns an entry on the pad for the results of subexpressions, and this intermediate target then holds the extra copy of the string.
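
      One way to shrink (though not eliminate) that extra copy is to build the value in smaller pieces, so the pad target for any single subexpression stays small. A hedged sketch, not from the original post:

      # The repeat subexpression's pad target now holds at most 1MB,
      # rather than a single 10MB intermediate.
      my $x = '';
      $x .= 'A' x 1_000_000 for 1 .. 10;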

      I'm not sure whether any of the pad-peeking CPAN modules show these unnamed pad entries, but the -DX debugging flag will help at least to verify their existence, eg:

      perl -DXt -we 'my $l = 100; my $a = "A" x $l;'

      Hugo

        Huh. I just tried your code and got the same results. I find that surprising; I would expect constant folding to happen at compile time, and so for the memory to be allocated at the start step.

      Just to chime in with my results: I don't know what version of perl sgifford was using, but I'm running perl 5.8.2 under Debian Linux.

      My system seems to start with half a meg more RAM used (556K), but the percentage freed seems similar (42.7% for me vs. 43.7% for sgifford).

      Does this imply a memory leak somewhere in perl, or just that perl's allocator caches the freed memory for re-allocation later?

Re: Re: Memory usage breakup
by shonorio (Hermit) on May 01, 2004 at 11:23 UTC
    What if I use 'my' on the variable?

    Solli Moreira Honorio
    Sao Paulo - Brazil
      Yes, even then.
        Oh my god!!! I must start using 'undef' too. I have many perl systems running as services on Windows, and I've been seeing some strange things in services that have been running for a long time.

        Solli Moreira Honorio
        Sao Paulo - Brazil