Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

<background> I have a perl process that does a lot of memory intensive stuff, then undefs all the big hashes and calls an external, somewhat memory-hungry binary. Trouble is, just undefing the hashes doesn't free nearly as much memory as it should. The only thing I can think of is that since perl doesn't immediately need that RAM, it thinks it's okay to keep going without collecting. </background>
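
Roughly the shape of it (a simplified sketch; the hash contents and the binary's path are just stand-ins):

    # build big in-memory structures and crunch them
    my %big = map { $_ => 'x' x 1024 } 1 .. 500_000;
    # ... memory-intensive work with %big ...

    undef %big;    # expect the memory back here, but the process stays huge

    # hypothetical external program standing in for the real one
    system('/usr/local/bin/hungry-binary') == 0
        or die "external binary failed: $?";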

Is there a gentle way to ask perl to kindly collect garbage *now*?

Many thanks,
Grem

Re: forced garbage collection
by merlyn (Sage) on Nov 30, 2000 at 06:18 UTC
      Excellent... I seem to remember reading somewhere that memory doesn't get returned to the OS... (all that system stuff is rotting away in some back corner of my mind...) It also seems to at least confirm that I wasn't completely out of my mind about how undef works.

      Leave it to merlyn to point out the CORRECT answer.

      cephas

      time to recreate the indexes on my brain...
Re: forced garbage collection
by mwp (Hermit) on Nov 30, 2000 at 05:53 UTC
    Not that I know of. Try using delete instead of undef on the hash keys.
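
    For instance (a quick sketch of what I mean):

        my %hash = (a => 1, b => 2);
        delete $hash{$_} for keys %hash;   # remove each key/value pair
        # or wipe everything in one go:
        %hash = ();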

    Update:
    Alternatively, scope the hash locally inside a code block. When the block ends, Perl should reclaim the namespace and therefore the memory. IIRC =)

    If you need a final version of the hash data to be globally available, put all the hash ops in a subroutine and use the return statement to return just what you need. As with the block above, when the subroutine ends, Perl will reclaim all that memory. Just don't return a reference to the data structure, or you'll keep it alive (a closure of sorts), which will defeat the entire purpose!
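
    Something like this (a sketch; crunch() and the summary key are stand-ins for your real code):

        sub crunch {
            my %big;
            # ... fill %big and do the memory-hungry work ...
            return $big{summary};   # return a copy of just what you need
        }

        my $summary = crunch();
        # %big went out of scope when crunch() returned, so perl can
        # reuse that memory -- as long as nothing still references it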

    Friar 'kaboo

Re (tilly) 1: forced garbage collection
by tilly (Archbishop) on Nov 30, 2000 at 07:06 UTC
    If you are on a *nix system then I suggest not worrying about it and seeing whether it all just gets pushed to swap.

    Two other possible solutions. The first is to tie some of your big hashes to DB_File. This is a minor code change, though it will slow down access to the hash elements; in exchange, the data lives on disk rather than in RAM. The second is to investigate moving the part that needs access to the large hashes into an external script.
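
    Tying is roughly this (a sketch; the filename is arbitrary):

        use DB_File;
        use Fcntl qw(O_RDWR O_CREAT);

        # %big now lives in a Berkeley DB file on disk instead of RAM
        tie my %big, 'DB_File', '/tmp/big.db', O_RDWR|O_CREAT, 0644, $DB_HASH
            or die "Cannot tie /tmp/big.db: $!";

        $big{key} = 'value';   # each access goes to disk -- slower, but small

        untie %big;
        unlink '/tmp/big.db';  # remove the scratch file when done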

Re: forced garbage collection
by quidity (Pilgrim) on Nov 30, 2000 at 08:13 UTC

    Once perl has used some system memory for something, you'll be damned lucky if you get it back for other processes to use before perl exits. undef'ing a variable returns the memory used by that variable to perl, so perl won't need to grab more memory from the system the next time it needs some.

    This is one of the reasons why slurping a whole file into memory is a bad thing, even if you throw it away soon afterwards.

    If avoiding memory bloat is important, then you'll need to design an algorithm that doesn't use much in-memory storage: work on small chunks of data and save the results to a file, then at the end of your program take the saved results and put them together into your final answer. Also consider using DB_File, although remember that disk access is much slower than RAM access.
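
    The chunk-at-a-time pattern looks roughly like this (process() and the filenames are stand-ins for your real work):

        sub process { my ($line) = @_; uc $line }   # stand-in for real work

        open my $in,  '<', 'input.dat'   or die "input.dat: $!";
        open my $out, '>', 'partial.dat' or die "partial.dat: $!";
        while (my $line = <$in>) {
            print $out process($line);   # only one line in memory at a time
        }
        close $out;
        close $in;
        # later: read partial.dat back and assemble the final answer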

      One way (under almost every operating system -- AmigaOS being a notable (almost) exception) to return allocated memory so that other processes can make use of it is to exit, exec, etc.

      So an alternate solution is to have your memory hog restart itself by execing itself. To do this, you'll probably need to persist some state from the old to the new instance of the script. Sometimes this is as simple as passing some command-line parameters to the new instance. Other times you might have to serialize some data structures to a temporary file to be restored by the new instance. Or you can even start the new instance and serialize your state down a pipe to it before the old instance exits.
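
      A sketch of the exec-yourself approach (Storable is just one way to serialize; the path and --resume flag are made up):

          use Storable qw(store);

          my %state = ( resume_point => 42 );   # whatever must survive the restart
          store(\%state, '/tmp/state.stor') or die "store failed: $!";

          # $^X is the running perl, $0 is this script
          exec $^X, $0, '--resume', '/tmp/state.stor'
              or die "exec failed: $!";

          # and near the top of the script, on restart:
          #   use Storable qw(retrieve);
          #   my $state = retrieve('/tmp/state.stor');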

      Oh, and, just to stress the point, Perl doesn't have a traditional garbage collector. There is nothing to trigger. "Garbage" is "collected" when the number of references to it goes to zero (except for "temporary" garbage which may not be collected until roughly the end of the statement). undef does remove references and so will "collect" "garbage" unless you've made other references to that garbage.
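
      A tiny illustration of the refcounting:

          my $big   = { map { $_ => 1 } 1 .. 100_000 };
          my $alias = $big;      # the hash now has two references
          undef $big;            # refcount drops to 1 -- nothing freed yet
          undef $alias;          # refcount hits 0 -- *now* it is "collected"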

              - tye (but my friends call me "Tye")
Re: forced garbage collection
by cephas (Pilgrim) on Nov 30, 2000 at 06:21 UTC
    Unless I'm badly mistaken, doing 'undef %hash;' will clear out the hash table and deallocate it in one fell swoop (/me does a lot of guessing and waves his hands around a bunch), probably via a call to Perl_hv_undef somewhere. But I do know that perl has trouble deallocating circular data structures properly unless you explicitly break the circle.
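
    Breaking the circle looks like this (a small sketch):

        my $node = {};
        $node->{self} = $node;   # circular: the hash refers to itself,
                                 # so its refcount can never reach zero

        delete $node->{self};    # break the circle first...
        undef $node;             # ...and now perl can actually free it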

    cephas