vernonlyon has asked for the wisdom of the Perl Monks concerning the following question:

How do I (if it's even possible) intercept any changes to the REFCNT of an object?

When you create a reference to an object the object's REFCNT increases by 1. When you lose a reference to an object, then it's REFCNT decreases by 1.

use B; my $ref = {}; # get a reference to our object (a hash) print B::svref_2object($ref)->REFCNT; # prints 1 my $copy = $ref; print B::svref_2object($ref)->REFCNT; # prints 2 undef $copy; print B::svref_2object($ref)->REFCNT; # prints 1 again

I would like to be able to "intercept" when these changes occur. Is this possible? I've been looking through the perl source, perlguts, perlapi ... As of yet though I can't find a way to do this.

So I now turn to the Monks, please help...

Thanx,
Vernon

Replies are listed 'Best First'.
Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object
by roboticus (Chancellor) on Sep 13, 2011 at 12:55 UTC

    vernonlyon:

    Have you looked at Devel::Monitor, Devel::WeakRef or Hash::NoRef? A brief CPAN search turned them up, and they may give you some clues as how you can monitor reference counts, or perhaps suggest an alternative to what you're trying to do.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      I've looked at those modules and a few others, but they don't touch on what I'm trying to achieve.
      I'd like to "hook" into when an object's REFCNT goes up or down anywhere in the program.

      The closest I could find was Variable::Magic, but it hooks into a variable not the thing it's referencing and it doesn't have hooks for REFCNTs.

      Thanx, though.

        As incrementing/decrementing the reference count happens (very) often, these are implemented as C macros and not hookable function calls in sv.c:

        #if defined(__GNUC__) && !defined(PERL_GCC_BRACE_GROUPS_FORBIDDEN) # define SvREFCNT_inc(sv) \ ({ \ SV * const _sv = MUTABLE_SV(sv); \ if (_sv) \ (SvREFCNT(_sv))++; \ _sv; \ }) # define SvREFCNT_inc_simple(sv) \ ({ \ if (sv) \ (SvREFCNT(sv))++; \ MUTABLE_SV(sv); \ }) # define SvREFCNT_inc_NN(sv) \ ({ \ SV * const _sv = MUTABLE_SV(sv); \ SvREFCNT(_sv)++; \ _sv; \ }) # define SvREFCNT_inc_void(sv) \ ({ \ SV * const _sv = MUTABLE_SV(sv); \ if (_sv) \ (void)(SvREFCNT(_sv)++); \ }) #else # define SvREFCNT_inc(sv) \ ((PL_Sv=MUTABLE_SV(sv)) ? (++(SvREFCNT(PL_Sv)),PL_Sv) : NULL) # define SvREFCNT_inc_simple(sv) \ ((sv) ? (SvREFCNT(sv)++,MUTABLE_SV(sv)) : NULL) # define SvREFCNT_inc_NN(sv) \ (PL_Sv=MUTABLE_SV(sv),++(SvREFCNT(PL_Sv)),PL_Sv) # define SvREFCNT_inc_void(sv) \ (void)((PL_Sv=MUTABLE_SV(sv)) ? ++(SvREFCNT(PL_Sv)) : 0) #endif

        The "easy" way is to just redefine these macros and recompile Perl. You should be aware that invoking Perl code from within such a hook would be very unwise, as Perl code inevitably will allocate values whose refcount needs to be incremented, which would invoke your hook, recursively.

        This is also the hard way.

        Clone::Fast has programatic hooks that could be integrated into B. Give that a try.
Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object
by chromatic (Archbishop) on Sep 13, 2011 at 17:53 UTC

    If you can't recompile Perl and all of your extensions, you can't do this.

    What are you trying to do that you need these notifications?


    Improve your skills with Modern Perl: the free book.

Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object
by ikegami (Patriarch) on Sep 13, 2011 at 17:53 UTC

    Can't, short of making changes to the Perl source.

    #define SvREFCNT(sv) (sv)->sv_refcnt #define SvREFCNT_inc(sv) ((PL_Sv=MUTABLE_SV(sv)) ? (++(SvREFCNT(PL_Sv) +),PL_Sv) : NULL)

    As you can see,

    SvREFCNT_inc(sv)

    boils down to

    ++( (sv)->sv_refcnt )

    What's the bigger picture?

Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object
by vernonlyon (Novice) on Sep 13, 2011 at 18:41 UTC

    OK, The bigger picture. Here goes ...

    It basically has to do with circular references.
    To deal with them we could use Scalar::Util::weaken or just cleanup the objects manually.
    Scalar::Util::weaken won't work for me because the code often refers to a "parent" object via a "child" and vice-versa. I'll illustrate:

    sub circ_ref { my $dad = {}; my $son = {}; $dad->{son} = $son; $son->{dad} = $dad; Scalar::Util::weaken $son->{dad}; return $dad; } my $dad = circ_ref(); # All is well my $son = circ_ref()->{son}; # Doh! Who's your daddy? print Data::Dumper::Dumper $dad, $son;

    So instead I have to cleanup the objects manually, but at what point? The references to either "parent" or "child" may hang around for while. Only when they have both gone can I cleanup the circ-ref. To do this I need to know when the REFCNTs have dropped to 1.

    So I was trying to find a way to "hook" SvREFCNT_dec for these objects, but I guess this is just not possible.
    Is there another way to do the cleanup?

      Several times I've solved this type of problem by having an outer wrapper class that is what the user of the module deals with. That gets destroyed when the user is done with it because its ref-count goes to 0.

      The outer wrapper is just a pointer to the inner object. The inner object can create circular references with other inner objects. When a wrapper is destroyed, it tells the inner object to break any potential circular references so that the inner object(s) will also be destroyed.

      This is pretty simple when you have a "container" and a bunch of contained things and the outer wrapper for the container going away means that you can destroy all of the inner, contained things. But you hint at a more complex situation. What if you want to have full life cycles for all of your objects instead of having to have "contained" objects whose life is bounded by some container's life?

      It doesn't have to be terribly complicated to pull that off. You can have outer wrappers that tell you when external users are done using an inner object (that is, you implement your own version of "ref counting").

      Then you want the wrapped (inner) objects to be destroyed when they can't be reached even indirectly by any outer wrapper. That is, you implement your own garbage collector for the inner objects.

      I wanted to see how tricky this really was, and not just in theory. (It also may serve as an excellent example for some other stuff I am working on.) So I threw together a quick, full, working implementation.

      First, I created a couple of hand-rolled roles. My::Outer is a role for the wrappers and My::Inner is a role for the inner, wrapped objects. And I implemented the roles using nothing more complicated than Exporter.pm.

      This means that the My::Parent and My::Child classes were extremely simple. They just declare themselves as playing the My::Outer role for My::Parent::Inner and My::Child::Inner, respectively. I also made them overload stringification just to make tracing simpler:

      package My::Parent; use My::Outer qw< ::Inner new DESTROY AUTOLOAD >; use overload '""' => \&GetName; package My::Child; use My::Outer qw< ::Inner new DESTROY AUTOLOAD >; use overload '""' => \&GetName;

      Yes, that is the whole implementation of both classes.

      The '::Inner' means "append '::Inner' to my class name to get the class name that I wrap". 'new' says that a new() method gets exported into the class (to be the constructor that the module's user calls). 'DESTROY' exports a DESTROY() method so that reference counts can be decremented. 'AUTOLOAD' exports an AUTOLOAD() method that automatically deligates so that any public methods (no leading underscore in the name) defined on the inner object can be called directly on the wrapper.

      Here is the full implementation of the My::Inner role.

      (Late update: Commented out (with "##") useless ->{wrapper} parts left over from early debugging when I was using objects before they were fully constructed. Then I made the role not require the inner objects be blessed refs while still not requiring the writing of two tiny methods if the inner objects are blessed refs. This improvement demonstrates the flexibility provided by having the using module explicitly list what methods it wants to import from the role. Though, the main reason I did that was so that the using class is fully documented, preventing the reader from having to read up on every used role in order to figure out where a particular method came from or just to know all of the methods the using class actually provides. New lines are marked by "#+" below.)

      You can see $in->{refs}++ and --$in->{refs} which is how simple ref-counting is to implement. The whole role (two subs) and the machinery for constructing roles (two lines) is quite simple.

      Now we can implement 'my inner child'. I used the following naming convention for telling inner/outer parent/child apart:

      # Outer objects: dad son These are wrappers given to other +s # | | just to track what is still in-us +e. # v v # Inner objects: _dad <------ _son These have the real data, includi +ng # `------------^ links to other (inner) objects.

      To make garbage collection simple, all links between inner objects are two-way. That way a single object can tell what reference cycles it is a part of without having to have some global registry of all inner objects. To make this example simple, I just went with "a Dad can have at most one Son and a Son can have at most one Dad".

      package My::Child::Inner; use My::Inner qw< _wrap _unwrap _incRefs _decRefs >; sub _new { my( $class, $dad, @args )= @_; # Replace this with your real constructor! my $_son= { name => 'Boy', @args }; bless $_son, $class; $$dad->_adopt( $_son ) if $dad; return $_son; } sub GetName { my( $_son )= @_; return $_son->{name}; } sub GetDad { my( $_son )= @_; my $_dad= $_son->{dad}; return undef if ! $_dad; # We must wrap objects returned from public methods! return $_dad->_wrap(); } # For debugging: sub DESTROY { my( $_son )= @_; warn "DESTROYing son: $_son->{name}\n"; }

      And here is how simple it can be to implement a garbage collector:

      # Called when Son is no longer externally referenced: sub _free { my( $_son )= @_; my $_dad= $_son->{dad}; if( ! $_dad || ! $_dad->{refs} ) { # Son can die (and take Dad with him) # if there is no Dad (or Dad is also unreferenced): $_son->{dad}= $_dad->{son}= undef; # Break ref cycles! } }

      And here is a quick test of the whole thing with tracing:

      { my $dad= My::Parent->new( name => 'Sr' ); warn "\$dad=$dad\n"; my $son= My::Child->new( $dad, name => 'Jr' ); warn "\$dad=$dad -> \$son=$son\n"; $dad= My::Parent->new( name => 'Newt' ); warn "\$dad=$dad; Sr -> \$son=$son\n"; warn "Sr no longer referenced, but not destroyed yet.\n"; $dad->Adopt( $son ); warn "\$dad=$dad -> \$son=$son; (Sr destroyed)\n"; My::Child->new( $dad, name => 'Young' ); warn "Young never really referenced, but not destroyed yet.\n"; warn "\$dad=$dad -> Young; \$son=$son\n"; $dad= My::Parent->new( name => 'Fin' ); warn "\$dad=$dad; \$son=$son (Newt -> Young destroyed)\n"; warn "Rest to be destroyed next.\n"; } warn "Everything destroyed above.\n";

      Which produces the following output:

      $dad=Sr $dad=Sr -> $son=Jr $dad=Newt; Sr -> $son=Jr Sr no longer referenced, but not destroyed yet. DESTROYing dad: Sr $dad=Newt -> $son=Jr; (Sr destroyed) Young never really referenced, but not destroyed yet. $dad=Newt -> Young; $son=Jr DESTROYing son: Young DESTROYing dad: Newt $dad=Fin; $son=Jr (Newt -> Young destroyed) Rest to be destroyed next. DESTROYing son: Jr DESTROYing dad: Fin Everything destroyed above.

      Note that "Newt -> Young destroyed" shows a circular reference getting destroyed as soon as it can no longer be reached externally.

      The full code is included inside of CODE tags inside an HTML comment so you can use the "Select" link to download the last CODE block if you want to download or just view the full, working example (Update: Or just get the full code, now that I know this node's ID). (Update: 2 tiny changes to full code.)

      - tye        

        This is by far the best method to deal with this situation.
        Using wrapper objects had occurred to me before, but wasn't used due to worries about performance.
        But it turns out very light outer objects can be used with minimal impact on performance.

        Thanx tye.

      It's hard to tell if this could be applied to your application, but the easiest way to deal with the circular references problem is to avoid it.

      Instead of storing references to other objects within your objects, store your objects in an array, and then store indexes to the parent/child within the object.

      package Family; my @objs; sub new { my( $parent, $child, %otherstuff ) = @_; push @objs, my $self = bless \%otherstuff, __PACKAGE__; $self->id( $#objs ); $self->parent( $parent->id ); $self->child( $child->id ); return $self; }

      It's one extra level of indirection, but no circular references.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      As an alternative to BrowserUk's approach of using an array, I've used a Foo::Graph object that holds all entities (nodes, edges) currently relevant. Each node and edge holds weak references to the connected nodes and edges, and it is responsibility of the user to hold onto the Foo::Graph object as long as any Foo::Edge or Foo::Node object is used somewhere. This is also not ideal, but might give you another idea.