I don't see the point of an extra level of indirection.

I'm ignoring the fact that it should use refaddr instead of norm and that it uses a hashref for $data instead of just %data, as those are implementation details and not really about the data structure philosophy. The point is that for N properties, the object data are kept in one HoH data structure instead of N hashes. Whether this is a benefit that outweighs the cost depends on the situation.

Consider each of the following:

Initialization: given a hashref of initialization properties, the N-hashes way requires copying over each parameter. The HoH way can use the hashref that was passed in directly (storing it or copying it as appropriate).

(Caveat: my examples are to illustrate the point -- this is not necessarily good/robust style as it doesn't validate parameters, assumes that no bogus initializers are passed, etc.)

    # N-hashes code adapted from docs to Class::Std
    sub BUILD {
        my ($self, $obj_ID, $arg_ref) = @_;
        $name{$obj_ID} = $arg_ref->{name} || $default_of{name};
        $rank{$obj_ID} = $arg_ref->{rank} || $default_of{rank};
        # repeat for N properties
    }

    # HoH way
    sub BUILD {
        my ($self, $obj_ID, $arg_ref) = @_;
        $data{$obj_ID} = { %default_of, %$arg_ref };
    }

Destruction: The same issue arises during DESTROY. From some of my benchmarking, explicit destruction of each entry in the data-storage hash is one of the most inefficient parts of the inside-out technique. That requires either N calls to delete for N properties, or a single delete of the object hashref in the %data hash.
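To make the difference concrete, here is a minimal sketch of the two DESTROY styles, written as plain Perl classes rather than with Class::Std. The package names NHashObject and HoHObject, the properties name and rank, and the live_count helper are assumptions for illustration only.

```perl
use strict;
use warnings;

package NHashObject;
use Scalar::Util qw(refaddr);

# one lexical hash per property (illustrative property names)
my ( %name, %rank );

sub new {
    my ( $class, $arg_ref ) = @_;
    my $self = bless \do { my $scalar }, $class;
    my $id   = refaddr($self);
    $name{$id} = $arg_ref->{name};
    $rank{$id} = $arg_ref->{rank};
    return $self;
}

sub DESTROY {
    my $id = refaddr(shift);
    # N deletes, one per property hash
    delete $name{$id};
    delete $rank{$id};
    return;
}

# demo helper only: number of entries still held in storage
sub live_count { return scalar( keys %name ) + scalar( keys %rank ) }

package HoHObject;
use Scalar::Util qw(refaddr);

my %data;    # object id => hashref of all properties

sub new {
    my ( $class, $arg_ref ) = @_;
    my $self = bless \do { my $scalar }, $class;
    $data{ refaddr($self) } = {%$arg_ref};
    return $self;
}

sub DESTROY {
    my $id = refaddr(shift);
    # a single delete drops every property of the object
    delete $data{$id};
    return;
}

sub live_count { return scalar keys %data }

package main;

{
    my $n = NHashObject->new( { name => 'Ann', rank => 'Capt' } );
    my $h = HoHObject->new( { name => 'Bob', rank => 'Lt' } );
    print NHashObject::live_count(), " ", HoHObject::live_count(), "\n";    # 2 1
}

# both objects have gone out of scope, so DESTROY has cleaned up
print NHashObject::live_count(), " ", HoHObject::live_count(), "\n";        # 0 0
```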

Thread-cloning: Inside-out objects are only thread-safe under 5.8 with the use of the CLONE subroutine. A subroutine like this is needed (from my article Threads and fork and CLONE, oh my!):

    sub CLONE {
        # fix-up all object ids in the new thread
        for my $old_id ( keys %REGISTRY ) {

            # look under old_id to find the new, cloned reference
            my $object = $REGISTRY{ $old_id };
            my $new_id = refaddr $object;

            # relocate data
            $NAME{ $new_id } = $NAME{ $old_id };
            delete $NAME{ $old_id };
            # repeat relocation for N properties

            # update the weak reference to the new, cloned object
            weaken ( $REGISTRY{ $new_id } = $REGISTRY{ $old_id } );
            delete $REGISTRY{ $old_id };
        }
        return;
    }

Using the HoH approach, only the %data hash would need to be relocated, rather than N individual hash entries.
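A rough sketch of what the HoH version of that CLONE might look like follows. It reuses the %REGISTRY idea from the code above. The class name HoHObject and the name accessor are assumptions, and it has not been exercised under real threads here.

```perl
use strict;
use warnings;

package HoHObject;
use Scalar::Util qw(refaddr weaken);

my %data;        # object id => hashref of all properties
my %REGISTRY;    # object id => weak reference to the object

sub new {
    my ( $class, $arg_ref ) = @_;
    my $self = bless \do { my $scalar }, $class;
    my $id   = refaddr($self);
    $data{$id} = {%$arg_ref};
    weaken( $REGISTRY{$id} = $self );
    return $self;
}

sub name { my $self = shift; return $data{ refaddr($self) }{name} }

sub CLONE {
    for my $old_id ( keys %REGISTRY ) {

        # take the entry out first, so the code still works if the
        # address happens to be unchanged
        my $object = delete $REGISTRY{$old_id};
        my $new_id = refaddr($object);

        # one relocation moves every property of the object
        $data{$new_id} = delete $data{$old_id};

        # re-register the cloned object under its new id
        weaken( $REGISTRY{$new_id} = $object );
    }
    return;
}

sub DESTROY {
    my $id = refaddr(shift);
    delete $data{$id};
    delete $REGISTRY{$id};
    return;
}

package main;

my $obj = HoHObject->new( { name => 'Ann', rank => 'Capt' } );

# perl calls CLONE itself in each new thread; calling it by hand here
# only shows that the data survives when addresses do not change
HoHObject->CLONE;
print $obj->name, "\n";    # Ann
```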

All this might raise the question "why not use regular hash-based objects, then?" The answer, of course, is that the inside-out technique works on any type of blessed reference. For example, either the N-hashes or the HoH approach could be used to provide object properties to (or subclass) a class that is implemented as a blessed globref.

Benchmarking:

Just for fun, I tried some benchmarking of these approaches using very simple "create/destroy" cycles of these two styles. One of the biggest drivers of efficiency is whether the HoH-style class actually needs to copy the initialization parameters from the hashref, or whether the hashref passed to new can be directly stored.

Directly stored:

                   Rate NHashObject HoHObject
    NHashObject 51694/s          --      -43%
    HoHObject   90761/s         76%        --

Store a shallow copy of the initialization hashref:

                   Rate NHashObject HoHObject
    NHashObject 50557/s          --       -4%
    HoHObject   52932/s          5%        --

The examples above had N = 6 properties. For N = 3, the advantage is reversed:

                   Rate HoHObject NHashObject
    HoHObject   60857/s        --         -4%
    NHashObject 63625/s        5%          --

So, as the number of properties increases, HoH style may offer efficiency gains -- though in practice, I suspect this may depend on how much property validation happens, as that may well swamp these small differences.
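For anyone who wants to repeat the comparison, a create/destroy benchmark along these lines could be sketched with the core Benchmark module. This is not the code that produced the numbers above. The two classes, the three properties and the shallow-copy choice are assumptions, and results will vary by machine and perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package NHashObject;
use Scalar::Util qw(refaddr);

my ( %name, %rank, %serial );    # N = 3 property hashes

sub new {
    my ( $class, $arg ) = @_;
    my $self = bless \do { my $scalar }, $class;
    my $id   = refaddr($self);
    $name{$id}   = $arg->{name};
    $rank{$id}   = $arg->{rank};
    $serial{$id} = $arg->{serial};
    return $self;
}

sub DESTROY {
    my $id = refaddr(shift);
    delete $name{$id};
    delete $rank{$id};
    delete $serial{$id};
    return;
}

package HoHObject;
use Scalar::Util qw(refaddr);

my %data;

sub new {
    my ( $class, $arg ) = @_;
    my $self = bless \do { my $scalar }, $class;
    $data{ refaddr($self) } = {%$arg};    # shallow copy of the initializer
    return $self;
}

sub DESTROY {
    my $id = refaddr(shift);
    delete $data{$id};
    return;
}

package main;

my %init = ( name => 'Ann', rank => 'Capt', serial => 42 );

# each sub creates an object and lets it be destroyed immediately;
# -1 asks Benchmark to run each case for at least one CPU second
cmpthese(
    -1,
    {
        NHashObject => sub { NHashObject->new( \%init ) },
        HoHObject   => sub { HoHObject->new( \%init ) },
    }
);
```

Adding more property hashes to NHashObject (and more keys to %init) is the way to vary N.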

-xdg

Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re^3: Inside-out classes structure
by rir (Vicar) on Oct 02, 2005 at 20:51 UTC
    You raise good points. I haven't thought as hard about the issue as you have. I have an interest because I am looking to abandon my usage of hash-based objects. I approached sh1tn's post from a don't-reinvent-the-wheel stance.

    Not copying the parameter hash seems to leave the state of our new object open to the whims of another reference. Our capsule is dissolved before it is created.

    I have used hash-based objects a lot; in my situations the trampling and encapsulation issues have not been a problem. But constraining the keys of the hash is something I quickly found useful. TheDamian's code lets strict give that to us by replacing strings with lexicals (I like that). I would appreciate your solution for HOHObjects or your other thoughts on this aspect of the matter.

    Class::Std's prime purpose is to constrain clients to the interface. That makes me wonder for what purpose you would use something other than a scalar as an object here; to me it seems to expose implementation to save only one indirection.

    Be well,
    rir

      Not copying the parameter hash seems to leave the state of our new object open to the whims of another reference. Our capsule is dissolved before it is created.

      I did mention that it wasn't necessarily best practice. I'd only recommend this if (a) efficiency was a primary concern and (b) I control the source of the hashref. Outside a custom hand-rolled class, it's really not a great idea.
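      The risk being discussed can be shown in a few lines. This is only an illustration: %data stands in for a class's HoH storage, and the keys direct and copied are made-up ids.

```perl
use strict;
use warnings;

my %data;    # stands in for the class's HoH storage

my $args = { name => 'Ann' };

$data{direct} = $args;         # store the caller's hashref as-is
$data{copied} = {%$args};      # store a shallow copy instead

$args->{name} = 'Mallory';     # the caller changes its hash afterwards

print "direct: $data{direct}{name}\n";    # direct: Mallory
print "copied: $data{copied}{name}\n";    # copied: Ann
```

      The directly stored entry follows the caller's change, while the copy does not, which is why direct storage only makes sense when the class controls where the hashref comes from.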

      I would appreciate your solution for HOHObjects or your other thoughts on this aspect of the matter.

      I think it depends on the usage you intend. With traditional hash-based objects, people get their "strictness" with accessors. That approach still works with an HOH-inside-out object. It preserves the other benefit of inside-out objects -- subclassing anything. Another approach is the one that I'm experimenting with in Object::LocalVars, using a wrapper around methods to alias variables in file-scope to the right object-specific data. With that approach, I can implement with any underlying data structure. (Note, I'm not saying that anyone should do this -- merely that TIMTOWTDI is still operative.)
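      As a sketch of the accessor route for an HoH inside-out class: the class name, the @PROPERTIES list and the constructor check below are assumptions for illustration. Note that this catches bad keys at run time, not at compile time the way lexical property hashes do under strict.

```perl
use strict;
use warnings;

package HoHObject;
use Scalar::Util qw(refaddr);
use Carp qw(croak);

my %data;
my @PROPERTIES = qw(name rank);    # the only keys this class allows

sub new {
    my ( $class, $arg_ref ) = @_;
    my $self    = bless \do { my $scalar }, $class;
    my %allowed = map { $_ => 1 } @PROPERTIES;
    for my $key ( keys %$arg_ref ) {
        croak "Unknown property '$key'" unless $allowed{$key};
    }
    $data{ refaddr($self) } = {%$arg_ref};
    return $self;
}

# generate one read/write accessor per allowed property
for my $prop (@PROPERTIES) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$prop" } = sub {
        my $self = shift;
        my $slot = $data{ refaddr($self) };
        $slot->{$prop} = shift if @_;
        return $slot->{$prop};
    };
}

sub DESTROY {
    my $id = refaddr(shift);
    delete $data{$id};
    return;
}

package main;

my $obj = HoHObject->new( { name => 'Ann', rank => 'Capt' } );
$obj->rank('Maj');
print $obj->name, " ", $obj->rank, "\n";    # Ann Maj

# a misspelled property dies instead of silently creating a new key
eval { $obj->nmae };
print "error: $@" if $@;
```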

      Class::Std's prime purpose is to constrain clients to the interface. That makes me wonder for what purpose you would use something other than a scalar as an object here

      Well, with all due respect to TheDamian, Class::Std is just one particular implementation of the inside-out paradigm. It embodies design choices that make it better for some applications than others. From what I can tell, it's designed to facilitate creating large, complex, internally-consistent, extensible class hierarchies. That means it does other things less well.

      As an example of why one might use something other than a scalar, the classic example is when you want to subclass a module that someone else wrote on CPAN. You want to add functionality and additional state, but you don't want to tightly couple your subclass to the parent class's implementation. (What if it changes? What if they add keys that collide with yours?) With the inside-out approach, it doesn't matter -- you construct the parent object, rebless it to your class, and then initialize your own state. It's completely orthogonal to the parent implementation.
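      The construct, rebless, initialize sequence could look roughly like this. Some::Parent is a hypothetical stand-in for the CPAN class, and My::Child and its color property are likewise made up.

```perl
use strict;
use warnings;

# stand-in for a CPAN class we do not control (hypothetical)
package Some::Parent;

sub new {
    my ( $class, %args ) = @_;
    return bless { label => $args{label} }, $class;
}

sub label { return $_[0]{label} }

package My::Child;
use Scalar::Util qw(refaddr);
our @ISA = ('Some::Parent');

my %data;    # our own state, keyed by object address

sub new {
    my ( $class, %args ) = @_;

    # construct the parent object, then rebless it into our class
    my $self = Some::Parent->new( label => $args{label} );
    bless $self, $class;

    # our state lives outside the object, so it cannot collide with
    # whatever keys the parent keeps inside its hash
    $data{ refaddr($self) } = { color => $args{color} };
    return $self;
}

sub color { return $data{ refaddr( $_[0] ) }{color} }

sub DESTROY {
    my $id = refaddr(shift);
    delete $data{$id};
    return;
}

package main;

my $obj = My::Child->new( label => 'box', color => 'red' );
print $obj->label, " ", $obj->color, "\n";    # box red
```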

      Another example would be when you want the object to be used like some other type of data than just an object. For example, you want a filehandle that you can print to, but that you can call methods and also save state. With traditional objects, if you want state, you can't use a blessed globref as the object. You can tie a class to a filehandle, of course, but then you have all the tie overhead. Instead, with an inside-out object, you bless the filehandle reference as the object, and store the state externally just like for any other inside-out object. Again, state is orthogonal to the type of blessed object.
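      Here is a minimal sketch of the filehandle case, assuming a made-up class LoggingHandle that counts lines written through its own method. The demo writes to an in-memory scalar instead of a real file.

```perl
use strict;
use warnings;

package LoggingHandle;    # hypothetical class name
use Scalar::Util qw(refaddr);
use Symbol qw(gensym);

my %data;    # state kept outside the globref

sub new {
    my ( $class, $target ) = @_;
    my $self = gensym();    # a fresh anonymous glob reference
    open $self, '>', $target or die "Can't open target: $!";
    bless $self, $class;
    $data{ refaddr($self) } = { lines => 0 };
    return $self;
}

# write a line through the object and count it
sub say_line {
    my ( $self, $text ) = @_;
    print {$self} "$text\n";
    return ++$data{ refaddr($self) }{lines};
}

sub lines { return $data{ refaddr( $_[0] ) }{lines} }

sub DESTROY {
    my $id = refaddr(shift);
    delete $data{$id};
    return;
}

package main;

my $buffer = '';
my $fh     = LoggingHandle->new( \$buffer );    # in-memory file for the demo

print {$fh} "plain print works\n";    # used as an ordinary filehandle
$fh->say_line("so do methods");       # used as an object with state
print "lines via method: ", $fh->lines, "\n";    # lines via method: 1

close $fh;
print $buffer;
```

      Only writes that go through say_line are counted, since a plain print bypasses the class entirely.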

      -xdg
