Intercept any changes to the sv_refcnt/SvREFCNT of an object

vernonlyon has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object by roboticus (Chancellor) on Sep 13, 2011 at 12:55 UTC
vernonlyon: Have you looked at Devel::Monitor, Devel::WeakRef or Hash::NoRef? A brief CPAN search turned them up, and they may give you some clues as how you can monitor reference counts, or perhaps suggest an alternative to what you're trying to do. ...roboticus When your only tool is a hammer, all problems look like your thumb.	[reply]
Re^2: Intercept any changes to the sv_refcnt/SvREFCNT of an object by vernonlyon (Novice) on Sep 13, 2011 at 14:50 UTC
I've looked at those modules and a few others, but they don't touch on what I'm trying to achieve. I'd like to "hook" into when an object's REFCNT goes up or down anywhere in the program. The closest I could find was Variable::Magic, but it hooks into a variable not the thing it's referencing and it doesn't have hooks for REFCNTs. Thanx, though.	[reply]
Re^3: Intercept any changes to the sv_refcnt/SvREFCNT of an object by Corion (Patriarch) on Sep 13, 2011 at 15:16 UTC
As incrementing/decrementing the reference count happens (very) often, these are implemented as C macros and not hookable function calls in sv.c: #if defined(__GNUC__) && !defined(PERL_GCC_BRACE_GROUPS_FORBIDDEN) # define SvREFCNT_inc(sv) \ ({ \ SV * const _sv = MUTABLE_SV(sv); \ if (_sv) \ (SvREFCNT(_sv))++; \ _sv; \ }) # define SvREFCNT_inc_simple(sv) \ ({ \ if (sv) \ (SvREFCNT(sv))++; \ MUTABLE_SV(sv); \ }) # define SvREFCNT_inc_NN(sv) \ ({ \ SV * const _sv = MUTABLE_SV(sv); \ SvREFCNT(_sv)++; \ _sv; \ }) # define SvREFCNT_inc_void(sv) \ ({ \ SV * const _sv = MUTABLE_SV(sv); \ if (_sv) \ (void)(SvREFCNT(_sv)++); \ }) #else # define SvREFCNT_inc(sv) \ ((PL_Sv=MUTABLE_SV(sv)) ? (++(SvREFCNT(PL_Sv)),PL_Sv) : NULL) # define SvREFCNT_inc_simple(sv) \ ((sv) ? (SvREFCNT(sv)++,MUTABLE_SV(sv)) : NULL) # define SvREFCNT_inc_NN(sv) \ (PL_Sv=MUTABLE_SV(sv),++(SvREFCNT(PL_Sv)),PL_Sv) # define SvREFCNT_inc_void(sv) \ (void)((PL_Sv=MUTABLE_SV(sv)) ? ++(SvREFCNT(PL_Sv)) : 0) #endif [download] The "easy" way is to just redefine these macros and recompile Perl. You should be aware that invoking Perl code from within such a hook would be very unwise, as Perl code inevitably will allocate values whose refcount needs to be incremented, which would invoke your hook, recursively. This is also the hard way.	[reply] [d/l]
Re^4: Intercept any changes to the sv_refcnt/SvREFCNT of an object by vernonlyon (Novice) on Sep 13, 2011 at 16:50 UTC
Re^3: Intercept any changes to the sv_refcnt/SvREFCNT of an object by Khen1950fx (Canon) on Sep 13, 2011 at 15:20 UTC
Clone::Fast has programatic hooks that could be integrated into B. Give that a try.	[reply]
Re^4: Intercept any changes to the sv_refcnt/SvREFCNT of an object by vernonlyon (Novice) on Sep 13, 2011 at 16:54 UTC
Re^5: Intercept any changes to the sv_refcnt/SvREFCNT of an object by Khen1950fx (Canon) on Sep 14, 2011 at 03:00 UTC
Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object by chromatic (Archbishop) on Sep 13, 2011 at 17:53 UTC
If you can't recompile Perl and all of your extensions, you can't do this. What are you trying to do that you need these notifications? Improve your skills with Modern Perl: the free book.	[reply]
Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object by ikegami (Patriarch) on Sep 13, 2011 at 17:53 UTC
Can't, short of making changes to the Perl source. `#define SvREFCNT(sv) (sv)->sv_refcnt #define SvREFCNT_inc(sv) ((PL_Sv=MUTABLE_SV(sv)) ? (++(SvREFCNT(PL_Sv) +),PL_Sv) : NULL)` [download] As you can see, `SvREFCNT_inc(sv)` [download] boils down to `++( (sv)->sv_refcnt )` [download] What's the bigger picture?	[reply] [d/l] [select]
Re: Intercept any changes to the sv_refcnt/SvREFCNT of an object by vernonlyon (Novice) on Sep 13, 2011 at 18:41 UTC
OK, The bigger picture. Here goes ... It basically has to do with circular references. To deal with them we could use `Scalar::Util::weaken` or just cleanup the objects manually. `Scalar::Util::weaken` won't work for me because the code often refers to a "parent" object via a "child" and vice-versa. I'll illustrate: `sub circ_ref { my $dad = {}; my $son = {}; $dad->{son} = $son; $son->{dad} = $dad; Scalar::Util::weaken $son->{dad}; return $dad; } my $dad = circ_ref(); # All is well my $son = circ_ref()->{son}; # Doh! Who's your daddy? print Data::Dumper::Dumper $dad, $son;` [download] So instead I have to cleanup the objects manually, but at what point? The references to either "parent" or "child" may hang around for while. Only when they have both gone can I cleanup the circ-ref. To do this I need to know when the REFCNTs have dropped to 1. So I was trying to find a way to "hook" SvREFCNT_dec for these objects, but I guess this is just not possible. Is there another way to do the cleanup?	[reply] [d/l] [select]
Re^2: Handling circular references (not Intercept changes to SvREFCNT) by tye (Sage) on Sep 14, 2011 at 05:35 UTC
Several times I've solved this type of problem by having an outer wrapper class that is what the user of the module deals with. That gets destroyed when the user is done with it because its ref-count goes to 0. The outer wrapper is just a pointer to the inner object. The inner object can create circular references with other inner objects. When a wrapper is destroyed, it tells the inner object to break any potential circular references so that the inner object(s) will also be destroyed. This is pretty simple when you have a "container" and a bunch of contained things and the outer wrapper for the container going away means that you can destroy all of the inner, contained things. But you hint at a more complex situation. What if you want to have full life cycles for all of your objects instead of having to have "contained" objects whose life is bounded by some container's life? It doesn't have to be terribly complicated to pull that off. You can have outer wrappers that tell you when external users are done using an inner object (that is, you implement your own version of "ref counting"). Then you want the wrapped (inner) objects to be destroyed when they can't be reached even indirectly by any outer wrapper. That is, you implement your own garbage collector for the inner objects. Read more... (987 Bytes) I wanted to see how tricky this really was, and not just in theory. (It also may serve as an excellent example for some other stuff I am working on.) So I threw together a quick, full, working implementation. First, I created a couple of hand-rolled roles. My::Outer is a role for the wrappers and My::Inner is a role for the inner, wrapped objects. And I implemented the roles using nothing more complicated than Exporter.pm. This means that the My::Parent and My::Child classes were extremely simple. They just declare themselves as playing the My::Outer role for My::Parent::Inner and My::Child::Inner, respectively. I also made them overload stringification just to make tracing simpler: `package My::Parent; use My::Outer qw< ::Inner new DESTROY AUTOLOAD >; use overload '""' => \&GetName; package My::Child; use My::Outer qw< ::Inner new DESTROY AUTOLOAD >; use overload '""' => \&GetName;` [download] Yes, that is the whole implementation of both classes. The '::Inner' means "append '::Inner' to my class name to get the class name that I wrap". 'new' says that a new() method gets exported into the class (to be the constructor that the module's user calls). 'DESTROY' exports a DESTROY() method so that reference counts can be decremented. 'AUTOLOAD' exports an AUTOLOAD() method that automatically deligates so that any public methods (no leading underscore in the name) defined on the inner object can be called directly on the wrapper. Here is the full implementation of the My::Inner role. (Late update: Commented out (with "##") useless `->{wrapper}` parts left over from early debugging when I was using objects before they were fully constructed. Then I made the role not require the inner objects be blessed refs while still not requiring the writing of two tiny methods if the inner objects are blessed refs. This improvement demonstrates the flexibility provided by having the using module explicitly list what methods it wants to import from the role. Though, the main reason I did that was so that the using class is fully documented, preventing the reader from having to read up on every used role in order to figure out where a particular method came from or just to know all of the methods the using class actually provides. New lines are marked by "#+" below.) Read more... (2 kB) You can see `$in->{refs}++` and `--$in->{refs}` which is how simple ref-counting is to implement. The whole role (two subs) and the machinery for constructing roles (two lines) is quite simple. Now we can implement 'my inner child'. I used the following naming convention for telling inner/outer parent/child apart: # Outer objects: dad son These are wrappers given to other +s # \| \| just to track what is still in-us +e. # v v # Inner objects: _dad <------ _son These have the real data, includi +ng # `------------^ links to other (inner) objects. [download] To make garbage collection simple, all links between inner objects are two-way. That way a single object can tell what reference cycles it is a part of without having to have some global registry of all inner objects. To make this example simple, I just went with "a Dad can have at most one Son and a Son can have at most one Dad". package My::Child::Inner; use My::Inner qw< _wrap _unwrap _incRefs _decRefs >; sub _new { my( $class, $dad, @args )= @_; # Replace this with your real constructor! my $_son= { name => 'Boy', @args }; bless $_son, $class; $$dad->_adopt( $_son ) if $dad; return $_son; } sub GetName { my( $_son )= @_; return $_son->{name}; } sub GetDad { my( $_son )= @_; my $_dad= $_son->{dad}; return undef if ! $_dad; # We must wrap objects returned from public methods! return $_dad->_wrap(); } # For debugging: sub DESTROY { my( $_son )= @_; warn "DESTROYing son: $_son->{name}\n"; } [download] And here is how simple it can be to implement a garbage collector: `# Called when Son is no longer externally referenced: sub _free { my( $_son )= @_; my $_dad= $_son->{dad}; if( ! $_dad \|\| ! $_dad->{refs} ) { # Son can die (and take Dad with him) # if there is no Dad (or Dad is also unreferenced): $_son->{dad}= $_dad->{son}= undef; # Break ref cycles! } }` [download] Read more... (2 kB) And here is a quick test of the whole thing with tracing: { my $dad= My::Parent->new( name => 'Sr' ); warn "\$dad=$dad\n"; my $son= My::Child->new( $dad, name => 'Jr' ); warn "\$dad=$dad -> \$son=$son\n"; $dad= My::Parent->new( name => 'Newt' ); warn "\$dad=$dad; Sr -> \$son=$son\n"; warn "Sr no longer referenced, but not destroyed yet.\n"; $dad->Adopt( $son ); warn "\$dad=$dad -> \$son=$son; (Sr destroyed)\n"; My::Child->new( $dad, name => 'Young' ); warn "Young never really referenced, but not destroyed yet.\n"; warn "\$dad=$dad -> Young; \$son=$son\n"; $dad= My::Parent->new( name => 'Fin' ); warn "\$dad=$dad; \$son=$son (Newt -> Young destroyed)\n"; warn "Rest to be destroyed next.\n"; } warn "Everything destroyed above.\n"; [download] Which produces the following output: `$dad=Sr $dad=Sr -> $son=Jr $dad=Newt; Sr -> $son=Jr Sr no longer referenced, but not destroyed yet. DESTROYing dad: Sr $dad=Newt -> $son=Jr; (Sr destroyed) Young never really referenced, but not destroyed yet. $dad=Newt -> Young; $son=Jr DESTROYing son: Young DESTROYing dad: Newt $dad=Fin; $son=Jr (Newt -> Young destroyed) Rest to be destroyed next. DESTROYing son: Jr DESTROYing dad: Fin Everything destroyed above.` [download] Note that "Newt -> Young destroyed" shows a circular reference getting destroyed as soon as it can no longer be reached externally. The full code is included inside of CODE tags inside an HTML comment so you can use the "Select" link to download the last CODE block if you want to download or just view the full, working example (Update: Or just get the full code, now that I know this node's ID). (Update: 2 tiny changes to full code.) - tye	[reply] [d/l] [select]
Re^3: Handling circular references (not Intercept changes to SvREFCNT) by vernonlyon (Novice) on Oct 25, 2011 at 10:45 UTC
This is by far the best method to deal with this situation. Using wrapper objects had occurred to me before, but wasn't used due to worries about performance. But it turns out very light outer objects can be used with minimal impact on performance. Thanx tye.	[reply]
Re^2: Intercept any changes to the sv_refcnt/SvREFCNT of an object by BrowserUk (Patriarch) on Sep 13, 2011 at 19:21 UTC
It's hard to tell if this could be applied to your application, but the easiest way to deal with the circular references problem is to avoid it. Instead of storing references to other objects within your objects, store your objects in an array, and then store indexes to the parent/child within the object. `package Family; my @objs; sub new { my( $parent, $child, %otherstuff ) = @_; push @objs, my $self = bless \%otherstuff, __PACKAGE__; $self->id( $#objs ); $self->parent( $parent->id ); $self->child( $child->id ); return $self; }` [download] It's one extra level of indirection, but no circular references. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^2: Intercept any changes to the sv_refcnt/SvREFCNT of an object by Corion (Patriarch) on Sep 13, 2011 at 20:32 UTC
As an alternative to BrowserUk's approach of using an array, I've used a `Foo::Graph` object that holds all entities (nodes, edges) currently relevant. Each node and edge holds weak references to the connected nodes and edges, and it is responsibility of the user to hold onto the `Foo::Graph` object as long as any `Foo::Edge` or `Foo::Node` object is used somewhere. This is also not ideal, but might give you another idea.	[reply] [d/l] [select]