
Timing of garbage collection

by dd-b (Monk)
on Jan 18, 2013 at 18:03 UTC ( [id://1014100]=perlquestion )

dd-b has asked for the wisdom of the Perl Monks concerning the following question:

As I understand it, object destruction happens in garbage collection, which is not guaranteed to happen at any particular time (before process termination). I must not count on actions in the destructor happening promptly when the last reference to an object goes away.

I've got a class that indexes members as they're created, and I need to keep that index clean because due to outside factors I don't control the same keys may come back fairly soon (outside forces prevent actual conflicts, but they do allow the same key to come up in close proximity in time).

So, I need to be sure that when I get rid of an object of this class, the cleanup happens immediately (before the next time I poll the outside forces for new work, which might give me new work with the same ID).

First, I put the cleanup in the destructor and just expected it to run. Then I realized that wasn't safe (due to the uncertain time of actual destruction), and changed to calling the destructor manually (I believe that's fairly safe, since this is a class used locally in a small controller program and I actually know when the objects aren't needed any more).

Then, after more thought, I added an instance method to take the current object out of the indexes, and just called that. (I also call it in the destructor; and because of that, it's become slightly more complicated and must handle the case where the object isn't currently in the indexes.)

(This maintaining an index is probably a recognized design pattern, but I'm not finding its name on a quick search. Sometimes factories do it, but this isn't a factory.)

So, just how undesirable is it to manually call object destructors? Is my current solution reasonably respectable? Or should I be using some completely different pattern, not trying to have the class index the class members for me or something? (Somebody has to index them.)
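
The arrangement described in the question (a class that indexes its own instances, an explicit removal method, and a destructor that calls it) might look roughly like the sketch below. The class name `Job`, the method names `lookup` and `unindex`, and the key `A17` are all hypothetical. The sketch assumes the index should hold weak references, so that the index itself never keeps an object alive, and that removal must be harmless when the object is no longer indexed or when a newer object has already claimed the same key.

```perl
package Job;    # hypothetical class name
use strict;
use warnings;
use Scalar::Util qw(weaken refaddr);

my %index;      # class-level index: key => weak reference to the object

sub new {
    my ($class, $key) = @_;
    my $self = bless { key => $key }, $class;
    $index{$key} = $self;
    weaken $index{$key};    # the index must not keep the object alive
    return $self;
}

sub lookup {
    my ($class, $key) = @_;
    return $index{$key};
}

# Safe to call repeatedly, and safe after a newer object has taken the key:
# the entry is removed only if it is stale or still refers to this object.
sub unindex {
    my ($self) = @_;
    my $entry = $index{ $self->{key} };
    delete $index{ $self->{key} }
        if !defined $entry || refaddr($entry) == refaddr($self);
    return;
}

sub DESTROY { $_[0]->unindex }

package main;

my $job = Job->new('A17');
$job->unindex;                 # explicit cleanup, at a moment we choose
my $next = Job->new('A17');    # the same key comes back
undef $job;                    # DESTROY runs here but leaves $next's entry alone
print Job->lookup('A17') == $next ? "index intact\n" : "index clobbered\n";
```

The identity check in `unindex` is the part that matters when keys recur: without it, a late destructor call on the old object could delete the entry that now belongs to its replacement.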

Replies are listed 'Best First'.
Re: Timing of garbage collection
by blue_cowdawg (Monsignor) on Jan 18, 2013 at 18:15 UTC
        So, just how undesirable is it to manually call object destructors?

    I heard Larry Wall talk about GC in Perl about eleven years ago and I'll paraphrase him here: "Something lives in memory in Perl until nobody cares."

    From what I remember there is a reference count associated with any variable (or object) and when that reference counter drops to zero the object is freed from memory.


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg

      Indeed...

      use strict;
      use warnings;
      use Test::More;
      use Scalar::Util qw(weaken isweak);
      use Devel::Refcount qw(refcount);

      my $A = bless {}, 'Monkey';
      is(refcount($A), 1, 'an object has a refcount');

      my $B = $A;
      is(refcount($A), 2, 'additional ref to the object increases the refcount');

      $B = undef;
      is(refcount($A), 1, 'removing a ref to the object decreases the refcount');

      do {
          my $C = $A;
          is(refcount($A), 2, 'reference in a scope increases refcount');
      };
      is(refcount($A), 1, '$C fell out of scope, so decreased ref count');

      my $D = $A;
      is(refcount($A), 2, 'another ref which will increase ref count');

      weaken $D;
      is(refcount($A), 1, 'weakening the ref means it is not included in the ref count');
      ok($A == $D, 'yet $A and $D are still the same object');

      my $destroyed = 0;
      sub Monkey::DESTROY { $destroyed++ };

      $A = undef;
      is($destroyed, 1, 'undefining $A destroyed the object; $D was a weak ref so did not prevent destruction');
      ok(!defined $D, '$D is now undefined');

      done_testing;
      perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: Timing of garbage collection
by muba (Priest) on Jan 18, 2013 at 20:40 UTC

    As long as you keep references around to your objects, you'll be able to access them through those references, wherever they may live. Could be deep down some data structure, or maybe even in a closure — as long as parts of your code still have access to those references, those parts of the code will have access to the objects.

    Ergo, as soon as an object is ready for garbage collection (i.e., you no longer have any references pointing to it), no part of your code will be able to access that object (I'm cutting corners here, because there are still weak references to consider, but that's irrelevant to this discussion).

    So if the Outside Force throws an object index at you that has been used before and it is meant to represent the same object as before, then I hope you still have a reference to that object. However, if the reused index is meant to represent a new object, and you no longer have a reference to the old object, then there's no problem at all. Whether or not perl has already garbage collected the old object, you won't be able to access it any more anyway.

    So in the end, it's all your responsibility.

    • Keep references to those objects that you will need again later, and you'll be able to get to those objects;
    • Remove references to the objects that you're done with. Then, even when the Outside Force reuses an old index, you'll still be forced to create a fresh new object because, as far as your code is concerned, the old object isn't anywhere to be found any longer.
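
The two rules above can be sketched with a plain hash that holds the only strong reference to each object. The class name `Worker`, the `gen` field, and the id `42` are made up for illustration, and the `@log` array exists only to record the order of events.

```perl
use strict;
use warnings;

my @log;    # records events so the ordering is visible
sub Worker::DESTROY { push @log, "destroyed $_[0]{id}" }    # hypothetical class

my %active;    # id => the only strong reference to each object

$active{42} = bless { id => 42, gen => 1 }, 'Worker';
delete $active{42};                  # last reference gone: DESTROY runs here
push @log, 'polling for new work';
$active{42} = bless { id => 42, gen => 2 }, 'Worker';    # same id, fresh object

print "$_\n" for @log;
```

Because the hash entry was the last reference, the destructor has already run by the time the next statement executes, so the reused id can only ever resolve to the new object.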
Re: Timing of garbage collection
by dave_the_m (Monsignor) on Jan 18, 2013 at 21:50 UTC
    As I understand it, object destruction happens in garbage collection, which is not guaranteed to happen at any particular time (before process termination). I must not count on actions in the destructor happening promptly when the last reference to an object goes away.
    No, perl uses reference counting rather than mark-and-sweep style garbage collection, so an object is guaranteed to be destroyed (and any destructor called) immediately its reference count goes to zero, not some random time later.

    Dave.
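
A small sketch of that guarantee, using a throwaway class named `Tracer` (hypothetical) whose destructor records when it runs:

```perl
use strict;
use warnings;

my @events;
sub Tracer::new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub Tracer::DESTROY { push @events, "DESTROY $_[0]{name}" }

{
    my $scoped = Tracer->new('scoped');
    push @events, 'inside block';
}                                  # $scoped's refcount hits zero at this brace
push @events, 'after block';

my $obj = Tracer->new('explicit');
undef $obj;                        # refcount hits zero on this statement
push @events, 'after undef';

print "$_\n" for @events;
# inside block
# DESTROY scoped
# after block
# DESTROY explicit
# after undef
```

In both cases the destructor runs before the following statement, which is the behaviour the original question needs.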

Re: Timing of garbage collection
by 7stud (Deacon) on Jan 19, 2013 at 06:29 UTC

    In "Intermediate Perl 2nd Ed., p. 54", Randal Schwartz helps propagate the apparent myth that perl does not release memory back to the operating system:

    Perl recycles the memory for the array only when all references (including the name of the array) go away. Here, Perl only reclaims memory when @skipper and all the references we created to it disappear. Such memory is available to Perl for other data later in this program invocation, and generally Perl doesn’t give it back to the operating system.

Re: Timing of garbage collection
by RichardK (Parson) on Jan 18, 2013 at 18:16 UTC

    What sort of perl objects are you using?

Re: Timing of garbage collection
by 7stud (Deacon) on Jan 18, 2013 at 19:57 UTC

    I've got a class that indexes members as they're created, and I need to keep that index clean because due to outside factors I don't control the same keys may come back fairly soon (outside forces prevent actual conflicts, but they do allow the same key to come up in close proximity in time).

    Huh?

Re: Timing of garbage collection
by Marshall (Canon) on Jan 19, 2013 at 00:09 UTC
    Perl doesn't have "garbage collection" in the sense that it never gives memory back to the OS.

    Once Perl has memory from the OS, it has it forever.
    A long-lived Perl app will reach a maximum memory usage and just stay at that number forever (provided no memory leaks).

    If you undefine an object ref: $obj_a = undef;
    or reassign it, say $obj_a = Method_X->new(...), that has the same effect, as long as no reference to an internal structure within $obj_a still exists.

    Perl has its own memory management and reuses memory when it can.

      Perl doesn't have "garbage collection" in the sense that it never gives memory back to the OS
      Sorry, but that statement is garbage.

      First, the statement is wrong, as already demonstrated by BrowserUk. Whether perl releases memory back to the OS or not depends simply on the implementation of malloc/realloc, as described at Re: Not able to release memory (malloc implementation).

      Second, whether it returns memory to OS is not related to garbage collection! As already noted by dave_the_m, perl uses a reference-counted garbage collector, so you get "deterministic destructors" for free (i.e. in perl, you are guaranteed that an object is destroyed (and destructor called) immediately its reference count goes to zero). BTW, deterministic destructors are a feature of the C++ RAII idiom yet are problematic when using a mark-and-sweep garbage collector, such as that used by Java, which is why Java has a "finally" clause (see also Dispose pattern).
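
As an illustration of the RAII point, here is a minimal scope-guard sketch that relies on deterministic destruction. `ScopeGuard` is a hypothetical helper written for this example, not a CPAN module, and `risky` is a made-up function that always fails.

```perl
use strict;
use warnings;

package ScopeGuard;    # hypothetical helper, defined here for the example
sub new     { my ($class, $code) = @_; return bless { code => $code }, $class }
sub DESTROY { my ($self) = @_; local $@; $self->{code}->() }    # keep $@ intact

package main;

my @trace;

sub risky {
    # Cleanup is tied to the lifetime of $guard, not to reaching a certain line.
    my $guard = ScopeGuard->new(sub { push @trace, 'cleanup' });
    push @trace, 'working';
    die "something failed\n";
}

eval { risky(); 1 } or do {
    chomp(my $err = $@);
    push @trace, "caught: $err";
};

print "$_\n" for @trace;
```

The cleanup callback fires while the exception unwinds `risky`, before the `eval` returns, which is what a `finally` clause has to provide explicitly in languages without deterministic destructors.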

      Perl doesn't have "garbage collection" in the sense that it never gives memory back to the OS.

      That's demonstrably not exactly true:

      C:\test>perl -E"say `tasklist|find \"$$\"`; $x=chr(0); $x x= 2e6; say `tasklist|find \"$$\"`; undef $x; say `tasklist|find \"$$\"`"
      perl.exe    139252 Console    1    4,660 K
      perl.exe    139252 Console    1    6,680 K
      perl.exe    139252 Console    1    4,724 K

      That shows perl allocating a 2MB scalar and then returning that 2MB to the OS.

      On my Perl/system, the break point for the size of allocations that are released back to the system is 1040351 bytes. Anything larger is released; that size or smaller is not:

      C:\test>perl -E"say `tasklist|find \"$$\"`; $x=chr(0); $x x= 1040352; say `tasklist|find \"$$\"`; undef $x; say `tasklist|find \"$$\"`"
      perl.exe    129340 Console    1    4,688 K
      perl.exe    129340 Console    1    5,788 K
      perl.exe    129340 Console    1    4,780 K

      C:\test>perl -E"say `tasklist|find \"$$\"`; $x=chr(0); $x x= 1040351; say `tasklist|find \"$$\"`; undef $x; say `tasklist|find \"$$\"`"
      perl.exe    241476 Console    1    4,704 K
      perl.exe    241476 Console    1    5,788 K
      perl.exe    241476 Console    1    5,792 K

      That number is around 8k less than 1MB, so presumably the threshold is 1MB internally, with some of it used for internal management.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        What version of Perl are you running? I am still on Ver 5.10.1 with Active State.
Re: Timing of garbage collection
by sundialsvc4 (Abbot) on Jan 18, 2013 at 22:19 UTC

    When I read this post, I zeroed in on:   So, “I need to be sure that when I get rid of an object of this class, the cleanup happens immediately.”   And my strong recommendation is:   “find a way not to have to care about that.”

    Reading further, I see that the core problem is that, to this application, “keys” are actually not “temporally unique” at all.   As the OP says, “the same key may come back fairly soon.”   The application obviously has no control over this:   the key, in other words, whatever it may be, is originating from the business.

    When something like this happens ... and it does happen a lot in the real world ... I normally solve the problem by treating these “external” keys as surrogate keys.   I do not use them to “uniquely” identify the records that the application must store, knowing that (for my technical purposes) they might not be unique.   Instead, I generate truly-unique, internal-only keys for the records that I need to store, and I map the business-provided keys to them.

    Very importantly:   it may well be that there is not a “one-to-one relationship” between the business-provided keys and my internal, known to be really-unique, mappings.   (Should be ... “always” swear-to god-and hope-to die should be ... but there ain’t.)   And so, ever so much more importantly, in spite of this “inconvenient truth,” my application nevertheless did not fall-down and burn in some kind of “I never thought this could happen to me” heap.   The business might have screwed-up, but I’m still standin’.   “Priceless.”

    Business-provided identifiers do all sorts of weird things in real life, because human beings are responsible for every bit of it.   Well, humans can deal with that sort of thing; computers can’t.   So, you have to find a way that the computer can, because in any argument between the computer and a human, the computer is the one that has to budge.

    Even when designing the internal storage architecture, I try to avoid having many references-to the same “storage object.”   Instead, I assign each object a (application-generated, not disclosed to anyone) random truly-unique key, and refer to it elsewhere using that key.   Yes, that involves an additional hash-lookup.

    If the business’s way of handling “an identifier” is posing problems to your application ... don’t try to fancy-pants program around it:   instead, map what you are given to something else (which you give to no one) which does meet the computer’s requirements.   Whatever you do, stay on your feet.
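
The mapping described above could be sketched as follows. Everything here is an assumption made for illustration: the function names `open_record` and `close_record`, the key `ORDER-7`, and the use of a simple counter for internal ids where the post speaks of random unique keys.

```perl
use strict;
use warnings;

my $next_id = 0;    # internal counter; never shown to the outside world
my %record;         # internal id  => record
my %current;        # business key => internal id of the live record

sub open_record {
    my ($business_key, %data) = @_;
    my $id = ++$next_id;
    $record{$id}            = { business_key => $business_key, %data };
    $current{$business_key} = $id;    # a reused key simply points at the new id
    return $id;
}

sub close_record {
    my ($id) = @_;
    my $rec = delete $record{$id} or return;
    my $key = $rec->{business_key};
    # Only drop the mapping if it still refers to the record being closed.
    delete $current{$key} if defined $current{$key} && $current{$key} == $id;
    return $rec;
}

my $first  = open_record('ORDER-7', note => 'first use');
my $second = open_record('ORDER-7', note => 'key reused before cleanup');
close_record($first);    # late cleanup of the old record

my $live = $record{ $current{'ORDER-7'} };
print "ORDER-7 -> $live->{note}\n";
```

With this layout the timing of cleanup stops mattering: closing the old record late cannot disturb the newer record that shares its business key.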

Re: Timing of garbage collection
by Anonymous Monk on Jan 19, 2013 at 18:37 UTC
    Also, regardless of the "memory size" reported, unused virtual-memory pages will wander out to the backing-store and stay there, courtesy of the operating system, and the real page-frames will be used by someone else who needs them.

Node Type: perlquestion [id://1014100]
Front-paged by Corion