I've been following the thread "Using the strict module in object oriented programming", and the reply "Re: Using the strict module in object oriented programming" suggested using arrayref based objects. I started up a long drawn out reply in there before deciding it was sufficiently off topic and merited another thread. So here we are.
Way back in the olden days of my OO perl programming, I used array based objects. They're faster (I recall benchmarking an arrayref vs hashref based object a few years back and finding that the arrayref was 33% faster), or at least they were back when I looked into it.
The standard headache with array based objects is that numeric indexes are much less readable than hash keys, as such:

    $obj->[0] = 'foo';
    $obj->[1] = 'bar';
    $obj->[2] = 'baz';

vs

    $obj->{'foo'} = 'foo';
    $obj->{'bar'} = 'bar';
    $obj->{'baz'} = 'baz';
The hashref is much prettier. A standard clever approach is to define some sort of constant or variable with a prettier lookup value, along these lines:
    package Class::Of::Obj;

    my $foo_idx = 0;
    my $bar_idx = 1;
    my $baz_idx = 2;

    $obj->[$foo_idx] = 'foo';
    $obj->[$bar_idx] = 'bar';
    $obj->[$baz_idx] = 'baz';
Much prettier than needing to remember all of those indexes all over the place. Of course, if you want to access the arrayref directly from outside of your class, you'd need to export out your constants (or scalars or whatever) to anything using your module that wants to directly access. That's messy, at best. A very good solution here is to use accessors and mutators from outside the class (you should be doing that anyway!) so the user doesn't need to worry about your constant indexes. You just use those internally yourself.
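To make the accessor idea concrete, here's a minimal sketch. The constructor `new` and the combined getter/setter methods `foo` and `bar` are names I'm making up for illustration, and I'm using `use constant` rather than the scalars above; the index constants never leave the package.

```perl
package Class::Of::Obj;
use strict;
use warnings;

# Slot indexes stay private to this package.
use constant FOO_IDX => 0;
use constant BAR_IDX => 1;

sub new {
    my ($class) = @_;
    return bless [], $class;
}

# Combined accessor/mutator: stores a value when one is passed in.
sub foo {
    my $self = shift;
    $self->[FOO_IDX] = shift if @_;
    return $self->[FOO_IDX];
}

sub bar {
    my $self = shift;
    $self->[BAR_IDX] = shift if @_;
    return $self->[BAR_IDX];
}

package main;

# The caller never sees an index.
my $obj = Class::Of::Obj->new;
$obj->foo('foo');
$obj->bar('bar');
print $obj->foo, ' ', $obj->bar, "\n";    # prints "foo bar"
```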
And this works all fine and dandy, fairly efficiently, fairly easily, and without your user needing to care (assuming you've wrapped everything in accessors, of course). But there's a gotcha you're bound to run into sooner or later - subclasses. Even worse, multiple subclasses.
Observe some scenarios:
    package Super::Class;

    my $foo_idx = 0;
    my $bar_idx = 1;
    my $baz_idx = 2;

    package Sub::Class;

    # err...now what? How do I start using the next open index?
    # Everything's hardwired up there and the Super::Class
    # isn't telling me. I'm stuck.
You can look in the Super::Class and determine its last index and hardwire your values to start after that, but then your code will break if the Super::Class adds in a new index. The Super::Class ends up marching straight into your first index and clobbering it. Very bad.
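Here's a small sketch of that clobbering. QUX_IDX stands in for the index that a later release of Super::Class adds; it and the accessors `qux` and `other` are hypothetical names for illustration only.

```perl
use strict;
use warnings;

package Super::Class;

use constant { FOO_IDX => 0, BAR_IDX => 1, BAZ_IDX => 2 };

# Hypothetical: added in a later release of Super::Class.
use constant QUX_IDX => 3;

sub new { my $class = shift; return bless [], $class }

sub qux {
    my $self = shift;
    $self->[QUX_IDX] = shift if @_;
    return $self->[QUX_IDX];
}

package Sub::Class;

BEGIN { our @ISA = ('Super::Class') }

# Hardwired back when Super::Class stopped at index 2.
use constant OTHER_IDX => 3;

sub other {
    my $self = shift;
    $self->[OTHER_IDX] = shift if @_;
    return $self->[OTHER_IDX];
}

package main;

my $obj = Sub::Class->new;
$obj->other('subclass data');
$obj->qux('superclass data');    # lands in the same slot as other()
print $obj->other, "\n";         # prints "superclass data"
```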
So now you cleverly move into using some sort of incrementer function, a la this scenario from the original post I was looking at:

    package Super::Class;

    my $idx = 0;
    sub NEXT_IDX { return $idx++ };

    use constant FOO_IDX => NEXT_IDX();
    use constant BAR_IDX => FOO_IDX + 1;
    use constant BAZ_IDX => FOO_IDX + 2;

    package Sub::Class;

    # need to wrap in BEGIN blocks so SUPER:: works in the use at
    # compile time.
    BEGIN {our @ISA = qw(Super::Class)};

    use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX();

    # oops! NEXT_IDX() has only been called once up there, so we just set
    # OTHER_IDX to 1, clobbering whatever's in the BAR_IDX field!

Okay, so the problem here is that NEXT_IDX() is only being called once, and BAR_IDX and BAZ_IDX are ignoring it. That's easy to fix.

    package Super::Class;

    use constant FOO_IDX => NEXT_IDX(); # 0
    use constant BAR_IDX => NEXT_IDX(); # 1
    use constant BAZ_IDX => NEXT_IDX(); # 2

    package Sub::Class;

    # need to wrap in BEGIN blocks so SUPER:: works in the use at
    # compile time.
    BEGIN {our @ISA = qw(Super::Class)};

    use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #now at 3, after BAZ_IDX

The next hassle is what happens when you have multiple subclasses:

    package Super::Class;

    use constant FOO_IDX => NEXT_IDX(); # 0
    use constant BAR_IDX => NEXT_IDX(); # 1
    use constant BAZ_IDX => NEXT_IDX(); # 2

    package Sub::Class;

    # need to wrap in BEGIN blocks so SUPER:: works in the use at
    # compile time.
    BEGIN {our @ISA = qw(Super::Class)};

    use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #now at 3, after BAZ_IDX

    package Other::Sub::Class;

    BEGIN {our @ISA = qw(Super::Class)};

    use constant DIFFERENT_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #now at 4, after OTHER_IDX
This has two concerns that I'm aware of:
The first is load order. Suppose the two subclasses happen to get loaded the other way around:

    package Super::Class;

    use constant FOO_IDX => NEXT_IDX(); # 0
    use constant BAR_IDX => NEXT_IDX(); # 1
    use constant BAZ_IDX => NEXT_IDX(); # 2

    package Other::Sub::Class;

    # need to wrap in BEGIN blocks so SUPER:: works in the use at
    # compile time.
    BEGIN {our @ISA = qw(Super::Class)};

    use constant DIFFERENT_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #now at 3, DIFFERENT THAN LAST TIME

    package Sub::Class;

    BEGIN {our @ISA = qw(Super::Class)};

    use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #now at 4, DIFFERENT THAN LAST TIME!

Whoops. Now you can't deserialize your object without making sure that your classes are loaded in the same order they were when your object was serialized.
The second is empty slots. With OTHER_IDX at 3 and DIFFERENT_IDX at 4, an Other::Sub::Class object never uses index 3 (it belongs to Sub::Class), so that slot just sits there empty. One slot is no big deal, but it gets bad when you have bigger objects. What if your super class has 25 slots? Then Sub::Class has 25 slots? Then Other::Sub::Class has 25 slots? Then Additional::Sub::Class has another 25 slots? Additional::Sub::Class there has 50 blank entries sitting around gobbling up memory needlessly.

Note that this is only a problem for the later classes. The super class still only uses 25 slots (nothing wasted), and Sub::Class still only uses 50 slots (nothing wasted). It's only when you get to Other::Sub::Class that you have 75 slots, 25 of which are wasted.
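The arithmetic for Additional::Sub::Class can be checked with a throwaway sketch. The slot ranges are assumptions taken straight from the 25-slots-per-class scenario: indexes 0..24 come from the super class and 75..99 are its own.

```perl
use strict;
use warnings;

# One Additional::Sub::Class object under the shared counter.
my @obj;
$obj[$_] = 'super' for 0 .. 24;     # inherited Super::Class slots
$obj[$_] = 'own'   for 75 .. 99;    # Additional::Sub::Class's own slots

# Indexes 25..74 belong to the two sibling classes and are never filled.
my $unused = grep { !defined $_ } @obj;
print scalar(@obj), " slots, $unused unused\n";    # prints "100 slots, 50 unused"
```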
Eek. So what do we do now? Well, there are some ways we can try and fix it. Ideally, we'd like our subclass's indexes to be independent of any other subclass's index. We can try something fancy like maintaining multiple indexes depending upon the package.

    package Super::Class;

    my $class_indexes = {};

    sub NEXT_IDX {
        my $class = shift;

        #if we have an index for this class, then return it
        if (defined $class_indexes->{$class}) {
            return ++$class_indexes->{$class};
        }
        else {
            no strict 'refs';
            my @isa = @{$class . "::ISA"};
            #root class has no super
            my $idx = @isa ? $isa[0]->CURR_IDX() : -1;
            $class_indexes->{$class} = $idx + 1;
            return $class_indexes->{$class};
        }
    }

    sub CURR_IDX {
        my $class = shift;
        return $class_indexes->{$class};
    }

    use constant FOO_IDX => __PACKAGE__->NEXT_IDX(); # 0
    use constant BAR_IDX => __PACKAGE__->NEXT_IDX(); # 1
    use constant BAZ_IDX => __PACKAGE__->NEXT_IDX(); # 2

    package Other::Sub::Class;

    BEGIN {our @ISA = qw(Super::Class)};

    use constant DIFFERENT_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #set to 3

    package Sub::Class;

    BEGIN {our @ISA = qw(Super::Class)};

    use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #also set to 3

Damn that's a lot of work! We keep separate counters for each subclass and increment them independently. To add a new index to our class, we need to look up our index and increment it, or, if we don't have one yet, we look to our superclass and increment there.
All wonderful in theory, but it still doesn't work. This approach introduces additional problems, even.
You now once again have the problem of the Super::Class adding in a new attribute index later in the day and stomping into our index space. Note that we didn't have that issue with the single global NEXT_IDX() incrementer.
And multiple inheritance completely destroys it.
    package Distant::Sub::Class;

    our @ISA = qw(Sub::Class Other::Sub::Class);

    #both OTHER_IDX and DIFFERENT_IDX point to index 3. Whoops.
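To watch that collision happen, here's a runnable sketch of the per-class counter scheme. It's a variation rather than a copy: I keep the counters in a plain hash declared without a run-time initializer so the compile-time `use constant` calls can fill it, and the `new`, `other` and `different` methods are hypothetical additions.

```perl
use strict;
use warnings;

package Super::Class;

my %class_indexes;    # last index handed out, per class

sub NEXT_IDX {
    my $class = shift;
    return ++$class_indexes{$class} if defined $class_indexes{$class};

    # First request from this class: start after its first parent.
    no strict 'refs';
    my @isa = @{ $class . '::ISA' };
    my $idx = @isa ? $isa[0]->CURR_IDX() : -1;
    return $class_indexes{$class} = $idx + 1;
}

sub CURR_IDX { my $class = shift; return $class_indexes{$class} }

sub new { my $class = shift; return bless [], $class }

use constant FOO_IDX => __PACKAGE__->NEXT_IDX();    # 0
use constant BAR_IDX => __PACKAGE__->NEXT_IDX();    # 1
use constant BAZ_IDX => __PACKAGE__->NEXT_IDX();    # 2

package Sub::Class;
BEGIN { our @ISA = ('Super::Class') }
use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX();    # 3

sub other {
    my $self = shift;
    $self->[OTHER_IDX] = shift if @_;
    return $self->[OTHER_IDX];
}

package Other::Sub::Class;
BEGIN { our @ISA = ('Super::Class') }
use constant DIFFERENT_IDX => __PACKAGE__->SUPER::NEXT_IDX();    # also 3

sub different {
    my $self = shift;
    $self->[DIFFERENT_IDX] = shift if @_;
    return $self->[DIFFERENT_IDX];
}

package Distant::Sub::Class;
BEGIN { our @ISA = ('Sub::Class', 'Other::Sub::Class') }

package main;

my $obj = Distant::Sub::Class->new;
$obj->other('set by Sub::Class');
$obj->different('set by Other::Sub::Class');    # same slot as other()
print $obj->other, "\n";    # prints "set by Other::Sub::Class"
```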
You could try looping through all of your parent classes and finding the highest index and basing off of there, but then you end up with the empty slot issue. And your super classes still stomp all over your internals if they add a new attribute later in the day.
A slick alternative is to give each class its own slot and increment there. This will fix the issue of the Super::Class later adding new attributes and help alleviate the problem with empty slots draining memory.
    package Super::Class;

    # now store a global class_idx in addition to the individual
    # indexes on each class
    my $class_idx = 0;
    sub NEXT_CLASS_IDX { return $class_idx++ };

    my $class_indexes = {};

    sub NEXT_IDX {
        my $class = shift;
        return $class_indexes->{$class}++;
    }

    use constant CLASS_IDX => __PACKAGE__->NEXT_CLASS_IDX();

    use constant FOO_IDX => __PACKAGE__->NEXT_IDX(); # 0
    use constant BAR_IDX => __PACKAGE__->NEXT_IDX(); # 1
    use constant BAZ_IDX => __PACKAGE__->NEXT_IDX(); # 2

    package Other::Sub::Class;

    BEGIN {our @ISA = qw(Super::Class)};

    use constant CLASS_IDX => __PACKAGE__->NEXT_CLASS_IDX();

    use constant DIFFERENT_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #set to 0

    package Sub::Class;

    BEGIN {our @ISA = qw(Super::Class)};

    use constant CLASS_IDX => __PACKAGE__->NEXT_CLASS_IDX();

    use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX(); #also set to 0
You now access your slots with a double lookup:
    my $obj = Sub::Class->new();
    $obj->[CLASS_IDX]->[OTHER_IDX];
Of course, that CLASS_IDX constant is only available inside your own namespace, and if you need to export, you'd have to rename so everyone doesn't stomp all over each other. This fixes the multiple inheritance issue, since OTHER_IDX is hanging off of Sub::Class's slot, and DIFFERENT_IDX is hanging off of Other::Sub::Class's slot. So even though they're both 0, they're in different places. You really need to use accessors with this approach, though, unless you want to keep track of all of the class constants for all of your superclasses.
We don't need to worry about looking at our super class's attribute indexes, since we'll never stomp on them. So the NEXT_IDX code is greatly simplified to just be a counter on our particular class.
You still have the problem with empty slots, but you only end up with empty slots for each additional subclass that exists, not every attribute of each additional subclass. And you also can't reliably serialize, due to load order issues.
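Here's a rough sketch of the slot-per-class layout wrapped in accessors, so nobody outside a class needs its CLASS_IDX. The methods `new`, `foo`, `other` and `different` are hypothetical, the counters are declared without run-time initializers so the compile-time `use constant` calls can fill them, and the class numbering depends on the load order in this one file.

```perl
use strict;
use warnings;

package Super::Class;

my ($next_class_idx, %next_attr_idx);
BEGIN { $next_class_idx = 0 }

sub NEXT_CLASS_IDX { return $next_class_idx++ }
sub NEXT_IDX       { my $class = shift; return $next_attr_idx{$class}++ }

use constant CLASS_IDX => __PACKAGE__->NEXT_CLASS_IDX();    # 0
use constant FOO_IDX   => __PACKAGE__->NEXT_IDX();          # 0

sub new { my $class = shift; return bless [], $class }

sub foo {
    my $self = shift;
    $self->[CLASS_IDX][FOO_IDX] = shift if @_;
    return $self->[CLASS_IDX][FOO_IDX];
}

package Sub::Class;
BEGIN { our @ISA = ('Super::Class') }
use constant CLASS_IDX => __PACKAGE__->NEXT_CLASS_IDX();     # 1
use constant OTHER_IDX => __PACKAGE__->SUPER::NEXT_IDX();    # 0

sub other {
    my $self = shift;
    $self->[CLASS_IDX][OTHER_IDX] = shift if @_;
    return $self->[CLASS_IDX][OTHER_IDX];
}

package Other::Sub::Class;
BEGIN { our @ISA = ('Super::Class') }
use constant CLASS_IDX     => __PACKAGE__->NEXT_CLASS_IDX();     # 2
use constant DIFFERENT_IDX => __PACKAGE__->SUPER::NEXT_IDX();    # also 0

sub different {
    my $self = shift;
    $self->[CLASS_IDX][DIFFERENT_IDX] = shift if @_;
    return $self->[CLASS_IDX][DIFFERENT_IDX];
}

package Distant::Sub::Class;
BEGIN { our @ISA = ('Sub::Class', 'Other::Sub::Class') }

package main;

# Multiple inheritance no longer collides: each class has its own slot.
my $obj = Distant::Sub::Class->new;
$obj->foo('super');
$obj->other('sub');
$obj->different('other sub');
print join(', ', $obj->foo, $obj->other, $obj->different), "\n";

# An Other::Sub::Class object still carries an empty slot for Sub::Class.
my $lonely = Other::Sub::Class->new;
$lonely->foo('x');
$lonely->different('y');
print scalar(@$lonely), " top-level slots, slot 1 is ",
    (defined $lonely->[1] ? 'used' : 'empty'), "\n";
```

The empty slot is now one per unrelated class rather than one per attribute, which is the trade described above.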
If you have complete control, you can theoretically add in a pure constant to each subclass. So Sub::Class has CLASS_IDX 1 and that's it. Nothing else can do it. That'll work if the code is only internal, but will break if it escapes into the wild. If Widget Corp releases Super::Class, then Frobnoz Corp can release Sub::Class (which they've hardwired to CLASS_IDX 1), then Foo, Inc. releases Other::Sub::Class (which they've also hardwired to CLASS_IDX 1), then you'll have problems if you try to use those two subclasses at once.
There may be solutions to these issues to allow you to continue using arrays. Or heck, they may not be enough of a concern to you. Me? I never solved these problems. Instead I just took the speed penalty and switched everything to hashrefs and stopped worrying about it.
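For comparison, here's roughly what the same hierarchy looks like with hashrefs. The attribute keys and the accessor names are made up for the example; the point is only that no class has to coordinate an index with anyone else, so load order and sibling classes stop mattering.

```perl
use strict;
use warnings;

package Super::Class;

sub new { my $class = shift; return bless {}, $class }

sub foo {
    my $self = shift;
    $self->{foo} = shift if @_;
    return $self->{foo};
}

package Sub::Class;
BEGIN { our @ISA = ('Super::Class') }

sub other {
    my $self = shift;
    $self->{other} = shift if @_;
    return $self->{other};
}

package Other::Sub::Class;
BEGIN { our @ISA = ('Super::Class') }

sub different {
    my $self = shift;
    $self->{different} = shift if @_;
    return $self->{different};
}

package Distant::Sub::Class;
BEGIN { our @ISA = ('Sub::Class', 'Other::Sub::Class') }

package main;

my $obj = Distant::Sub::Class->new;
$obj->foo('super');
$obj->other('sub');
$obj->different('other sub');

# Only the attributes actually set take up space.
print join(', ', map { "$_=$obj->{$_}" } sort keys %$obj), "\n";
# prints "different=other sub, foo=super, other=sub"
```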
In reply to Problems I've had with array based objects by jimt