cLive ;-) has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I'm now working on a daemon and am trying to improve garbage collection. Until now, I was under the impression that collection was completely automatic, but this quick test says otherwise (on Linux):

#!/usr/bin/perl
use strict;
use warnings;

show_size();
{
    my @var = (0 .. 1000000);
    show_size();
}
show_size();
exit(0);

sub show_size {
    local $/;
    open(my $pfh, '<', "/proc/$$/status") || die $!;
    my $size = <$pfh> =~ /VmSize:\s+(\d+)/ ? $1 : 'unknown';
    close($pfh);
    print "Process size: $size\n";
}

# Output is
# Process size: 49956
# Process size: 53864
# Process size: 53864

But, if I add in some explicit undefs, the result changes:

#!/usr/bin/perl
use strict;
use warnings;

show_size();
{
    my @var = (0 .. 1000000);
    show_size();
    undef @var;
}
show_size();
exit(0);

sub show_size {
    local $/;
    open(my $pfh, '<', "/proc/$$/status") || die $!;
    my $size = <$pfh> =~ /VmSize:\s+(\d+)/ ? $1 : 'unknown';
    close($pfh);
    print "Process size: $size\n";
    undef $pfh;
    undef $size;
}

# Output is
# Process size: 49660
# Process size: 53568
# Process size: 49660

I was under the impression that when a variable went out of scope it would get cleaned up by Perl's automatic garbage collection. But if that's the case, why does the process size not go down after @var falls out of scope without the explicit undef? Am I misunderstanding how garbage collection works? A colleague says that Perl will free up the memory for re-use, but won't let it go. Is this how it really works?

Either way, what is the best practice for controlling memory use in a daemon setup? Any insights appreciated.

cLive ;-)

Re: Understanding garbage collection specifics...
by zentara (Cardinal) on Feb 01, 2007 at 11:42 UTC
    I'm not an expert on garbage collection, but I played around with it a while ago and found this tidbit: you also need to undef the array before it goes out of scope. This works:
    #!/usr/bin/perl
    use strict;
    use warnings;

    show_size();
    {
        my @var = (0 .. 1000000);
        show_size();
        # undef before it goes out of scope
        undef @var;
    }
    show_size();
    exit(0);

    sub show_size {
        local $/;
        open(my $pfh, '<', "/proc/$$/status") || die $!;
        my $size = <$pfh> =~ /VmSize:\s+(\d+)/ ? $1 : 'unknown';
        close($pfh);
        print "Process size: $size\n";
    }

    Output:

    Process size: 47936
    Process size: 51976
    Process size: 48064
    As you can see, a bit of it is left over. It works in this simple example, but it may not in big, complex scripts, or where objects are involved.

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
      Thanks for paying attention to my original post :)
        Doh.... damn that Speed Reading Class !! :-)

        I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: Understanding garbage collection specifics...
by Anonymous Monk on Feb 01, 2007 at 07:14 UTC
    Have you read the FAQ entry on this? Memory used and freed by Perl is not released to the OS (except on MacOS, if I recall correctly), so the apparent memory usage will not go down, but that memory is still available for Perl to re-use for other items.
    -- Randal L. Schwartz, Perl hacker
      I didn't write that recently. Had I written it recently, I would have said that any malloc that is smart enough to use mmap() instead of just sbrk() can return chunks back to the O/S. And apparently, that includes many current popular Unix-like operating systems.
      I knew that, but I was wondering why undef does free the memory and why Perl doesn't just do that automatically.
        As an optimisation, perl often doesn't completely clear lexical arrays on scope exit. It frees the elements of the array, but the block of memory it has to hold pointers to the 1E6 elements isn't freed. This is on the assumption that if you entered the block once and created a big array, you're likely to do so again. On the other hand, undef frees the pointer block too.

        Dave.
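
        Dave's point can be checked from the OP's own test harness: a minimal sketch (assumes Linux's /proc; absolute sizes will vary by build). If perl keeps the pointer block on scope exit and reuses it, a second pass through the block should not grow the process much further:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read this process's VmSize (in kB) from /proc, as in the OP's example.
sub vm_size {
    local $/;
    open my $fh, '<', "/proc/$$/status" or die $!;
    return <$fh> =~ /VmSize:\s+(\d+)/ ? $1 : die "no VmSize";
}

my @sizes;
for my $pass (1 .. 2) {
    { my @var = (0 .. 1_000_000); }   # elements freed on scope exit, pointer block kept
    push @sizes, vm_size();
    printf "after pass %d: %d kB\n", $pass, $sizes[-1];
}
```

        On a typical Linux perl, the two "after pass" sizes come out nearly identical, because pass 2 reuses the pointer block (and the freed SV bodies) left behind by pass 1.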

Re: Understanding garbage collection specifics...
by Anonymous Monk on Feb 01, 2007 at 07:17 UTC
Re: Understanding garbage collection specifics...
by mreece (Friar) on Feb 01, 2007 at 15:25 UTC
    some of the comments in this node may help understand what is going on with memory allocation and lexical variables.
Re: Understanding garbage collection specifics...
by zentara (Cardinal) on Feb 01, 2007 at 17:17 UTC
    Just for the sake of comparison, here is how the Glib library makes memory collection easy. It's C, but maybe Perl 6 will draw from Glib?
    #include <glib.h>
    #include <stdio.h>

    /* compile with:
       gcc -o freetest freetest.c `pkg-config --cflags --libs glib-2.0` */

    int main()
    {
        int i, c;
        int *array = g_new0(int, 20000000);

        printf("check mem declared, not filled\n");
        c = getchar();

        for (i = 0; i < 20000000; i++) {
            array[i] = i;
        }
        printf("check mem, filled\n");
        c = getchar();

        g_free(array);
        printf("check mem, freed\n");
        c = getchar();

        return 0;
    }

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: Understanding garbage collection specifics...
by Moron (Curate) on Feb 01, 2007 at 13:40 UTC
    As I understand it, garbage collection only takes place when all references to the "garbage" are removed.
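
    A minimal sketch of that reference-counting rule: the array outlives its lexical scope for as long as any other reference to it exists, and is freed only when the last reference goes away.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $ref;
{
    my @data = (1 .. 10);
    $ref = \@data;         # a second reference keeps @data alive
}                          # lexical @data goes out of scope here...
print scalar @$ref, "\n";  # ...but the array is still intact: prints 10
undef $ref;                # last reference gone; now the array is freed
```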

    A daemon generally has to fork or do something similar to handle new requests, and this tends to mean memory is reclaimed by process termination rather than by garbage collection.

    Otherwise declare request structures at a higher scope than the request servicing routines and simply remove them on completion - any working references at a lower level will not then prevent garbage collection.

    Update: alternatively, even easier: declare within the scope of the service routine and return nothing, or only a status code, to the higher scope on completion - no references will outlive the handling in that case.
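
    A sketch of that update, with hypothetical parse() and process() stand-ins (not real APIs): all per-request data is lexical to the handler, and only a status string is returned, so no reference outlives the request.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in helpers, just for illustration.
sub parse   { my ($raw) = @_; return (body => $raw) }
sub process { my ($req) = @_; return length($req->{body}) ? ($req->{body}) : () }

sub handle_request {
    my ($raw)   = @_;
    my %request = parse($raw);           # lexical; freed on return
    my @results = process(\%request);    # lexical; freed on return
    return @results ? 'OK' : 'EMPTY';    # only a status code escapes
}

print handle_request('hello'), "\n";   # OK
print handle_request(''), "\n";        # EMPTY
```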

    -M

    Free your mind

Re: Understanding garbage collection specifics...
by dmitri (Priest) on Feb 05, 2007 at 23:07 UTC