Does one encapsulate a class from itself?

amarquis has asked for the wisdom of the Perl Monks concerning the following question:

This is probably a strange question, and will show how little OO work I've done. But, I've been thinking about it lately so here goes nothin':

I get the idea behind encapsulation: you provide a public interface of methods to get and set data so the user doesn't mess around under the hood.

But what about in the module itself? Say I've a Perl module that takes in some data, stores it as an internal representation, has some manipulation methods, and finally outputs the data. Should those manipulation methods be aware of the underlying representation, or should they too use getters and setters?

Specific example: I'm currently working on a module at $work to help with a task similar to what I mention above (data in, maybe manipulation, data out in new format). The object itself is a blessed hash reference, and the main data gets stored in a 2-d array, a reference to which is in the hash.

Say I've a method that needs to find a cell in that matrix and modify it. Should it use $self->getcell($x, $y) or $self->{'dataref'}[$x][$y]?

On one hand, using the getters/setters provides the benefits of encapsulation inside the class. I'd be able to seamlessly change the data structure with little work. And were there things I wanted to track about each get and set, I could track them right in the getters/setters, knowing everything went through them.
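To make the trade-off concrete, here is a minimal sketch of the accessor route. The class name Grid, the constructor arguments, the sets counter, and the double_cell method are hypothetical; only getcell, the 'dataref' key, and the blessed-hash layout come from the description above (setcell is the obvious counterpart).

```perl
package Grid;
use strict;
use warnings;

# Hypothetical class: a blessed hash holding a reference to a 2-d array.
sub new {
    my ($class, $rows, $cols) = @_;
    my @data = map { [ (0) x $cols ] } 1 .. $rows;
    return bless { dataref => \@data, sets => 0 }, $class;
}

# Every read goes through here, so the layout is known in one place only.
sub getcell {
    my ($self, $x, $y) = @_;
    return $self->{dataref}[$x][$y];
}

# Every write goes through here, so it can be counted (or logged, or validated).
sub setcell {
    my ($self, $x, $y, $value) = @_;
    $self->{sets}++;
    $self->{dataref}[$x][$y] = $value;
    return $self;
}

# A manipulation method written only in terms of the accessors.
sub double_cell {
    my ($self, $x, $y) = @_;
    return $self->setcell($x, $y, 2 * $self->getcell($x, $y));
}

package main;

my $grid = Grid->new(2, 3);
$grid->setcell(1, 2, 21);
$grid->double_cell(1, 2);
print $grid->getcell(1, 2), " after $grid->{sets} sets\n";    # 42 after 2 sets
```

If the storage later changes, only new, getcell, and setcell need to be touched; double_cell stays as it is.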

On the other hand, I've got many, MANY operations to do, and adding a few million subroutine calls slows things down significantly (but not insufferably). This is the argument I've always allowed to sway me, and until I began playing around with this idea I always let my modules directly access the underlying data structures.

This may very well be a dumb question in addition to a strange one, but I'm still curious as to what the prevailing wisdom is on this.


Replies are listed 'Best First'.
Re: Does one encapsulate a class from itself? by ikegami (Patriarch) on Mar 06, 2008 at 17:34 UTC
The two main reasons to use accessors are to provide a checkpoint where data can be validated before being used by the class, and to allow the internal format of the data to change without having to change the interface. Do you need those checks when you access the data internally? If the data format changes internally, will it be a hardship to change the other methods accordingly? The answer is usually no, removing the need to use accessors internally.

Incidentally, recent experience has taught me that using an accessor internally can prevent a class from being inherited. A class I was using inherited from another class in order to override accessors, so that it could pre-process (encode/decode) the data being exchanged. This worked fine with the XS implementation of the base class, since it didn't use the accessor internally, but it failed with the PP implementation of the base class, since it used the accessor internally, causing the encoding/decoding to occur twice. This is more of a lesson in the need to use composition instead of inheritance than a lesson against the internal use of accessors, but inheritance is misused in this manner so often that it should be taken into consideration.

If you want to use accessors internally for whatever reason (say, to reduce code duplication), it might be useful to have accessors for internal use and accessors for external use (`sub accessor { &_accessor }`).
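The internal/external split suggested above could look roughly like this. It is a sketch only: the Record class, its name field, the validation rule, and the shout method are all made up for illustration.

```perl
package Record;
use strict;
use warnings;
use Carp qw(croak);

sub new { return bless { name => '' }, shift }

# Internal accessor: no validation, meant for callers inside the class.
sub _name {
    my $self = shift;
    $self->{name} = shift if @_;
    return $self->{name};
}

# Public accessor: validates, then hands the same @_ to the internal one.
sub name {
    croak 'name must be a non-empty string'
        if @_ > 1 && !(defined $_[1] && length $_[1]);
    return &_name;    # "&" with no parentheses reuses the current @_
}

# Internal code calls _name, so a subclass that overrides name() to
# encode or decode values is not applied a second time here.
sub shout {
    my $self = shift;
    return uc $self->_name;
}

package main;

my $rec = Record->new;
$rec->name('ikegami');
print $rec->shout, "\n";    # IKEGAMI
```

The leading underscore is only a naming convention; Perl does not stop outside code, or a subclass, from calling or overriding _name.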
Re: Does one encapsulate a class from itself? by herveus (Prior) on Mar 06, 2008 at 17:48 UTC
Howdy! I think your analysis is pretty well spot on. If you severely limit the place where your code relies on the underlying storage mechanism, you have less code to change when you decide to redo that mechanism. Changing from a "classic" hash-based object to an inside-out object may be a bit contrived, but adding a persistent backing store such as a database would be a plausible case. You might even go so far as to create a class to be the raw object distinct from the main class that is the public interface.

yours, Michael
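One way to read the "raw object" suggestion, sketched under assumptions: a storage class that owns the representation and a public class that only delegates to it. The names Grid, Grid::Storage::Memory, fetch, store, and the storage constructor argument are hypothetical, and a database-backed class with the same two methods is only implied here, not shown.

```perl
use strict;
use warnings;

package Grid::Storage::Memory;    # hypothetical raw-object class

sub new   { return bless { cells => {} }, shift }
sub fetch { my ($self, $x, $y) = @_; return $self->{cells}{"$x,$y"} }
sub store { my ($self, $x, $y, $v) = @_; $self->{cells}{"$x,$y"} = $v; return }

package Grid;                     # public interface; knows nothing about the layout

sub new {
    my ($class, %args) = @_;
    my $storage = $args{storage} || Grid::Storage::Memory->new;
    return bless { storage => $storage }, $class;
}

sub getcell { my ($self, $x, $y) = @_; return $self->{storage}->fetch($x, $y) }

sub setcell {
    my ($self, $x, $y, $v) = @_;
    $self->{storage}->store($x, $y, $v);
    return $self;
}

package main;

my $grid = Grid->new;
$grid->setcell(0, 1, 'a');
print $grid->getcell(0, 1), "\n";    # a
```

Swapping the backing store then means passing a different storage object to Grid->new, at the price of one more method call per cell access.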
Re: Does one encapsulate a class from itself? by kyle (Abbot) on Mar 06, 2008 at 18:44 UTC
I prefer to use the object's accessors even within the object itself. As you say, it provides the benefits of encapsulation inside the class. In particular, I can change the representation of some part of it without too much disturbance. If performance becomes a demonstrated issue, I might go back on that, but usually I spend a lot more time waiting for things outside my code than I do dragging my feet inside. YMMV.
Re^2: Does one encapsulate a class from itself? by amarquis (Curate) on Mar 06, 2008 at 20:24 UTC
Side question: you linked profiling information, but I used Benchmark. Should I have gone with profiling instead? I reasoned that just knowing where time was spent wasn't valuable to me (since, for example, maybe more time spent in accessors was less time spent in subroutines. I know I could have subtracted one from the other, but it was seeming pretty approximate when I was thinking about it). What I did do was make up versions of important subs that use accessors and benchmark, for example, iterations of a sub versus its getter/setter-using counterpart. Seemed to do the trick. I wish I'd left the modified subs around now, though, so I'd have something to share.
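Since the original subs were not kept, here is a hypothetical stand-in for that kind of comparison, using cmpthese from the core Benchmark module. The Grid class, the 50x50 size, and both sum methods are invented for illustration; the timings will depend on the machine and the perl build, so none are quoted.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package Grid;

sub new {
    my ($class, $size) = @_;
    my @data = map { [ (1) x $size ] } 1 .. $size;
    return bless { size => $size, dataref => \@data }, $class;
}

sub getcell { my ($self, $x, $y) = @_; return $self->{dataref}[$x][$y] }

# Same loop twice: once through the getter, once straight into the structure.
sub sum_with_accessor {
    my $self = shift;
    my $last = $self->{size} - 1;
    my $sum  = 0;
    for my $x (0 .. $last) {
        $sum += $self->getcell($x, $_) for 0 .. $last;
    }
    return $sum;
}

sub sum_direct {
    my $self = shift;
    my $last = $self->{size} - 1;
    my $sum  = 0;
    for my $x (0 .. $last) {
        $sum += $self->{dataref}[$x][$_] for 0 .. $last;
    }
    return $sum;
}

package main;

my $grid = Grid->new(50);
die 'the two versions disagree' unless $grid->sum_with_accessor == $grid->sum_direct;

# A negative count means "run each candidate for at least that many CPU seconds".
cmpthese(-1, {
    accessor => sub { $grid->sum_with_accessor },
    direct   => sub { $grid->sum_direct },
});
```

This measures the two subs in isolation; it says nothing about how much of a real run is spent in them.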
Re^3: Does one encapsulate a class from itself? by kyle (Abbot) on Mar 06, 2008 at 23:19 UTC
There's not much point in optimizing something if it's only called once. Likewise, you may have something that's very fast but called a million times, so it could be worth trying to make it even faster.

Of course, if you can take some item and Benchmark two solutions for it and pick the faster one, that will be better for performance in all cases. However, the time you spend doing that might have been better spent working on something else. That's what profiling will tell you: where your program actually spends its time in a real run.

If you're writing a module, you can't always tell how it will be used. Otherwise, test your program to see where it really spends its time, and optimize those places where it will make the biggest difference. There's no reason to guess about it if you can test it easily.

As for your concern about approximations and calls within calls, the profiling tools can give you meaningful statistics either way. You can look at time spent only in a certain function or at the full cumulative time spent in that function and everything it calls.
Re: Does one encapsulate a class from itself? by chromatic (Archbishop) on Mar 06, 2008 at 18:05 UTC
"On the other hand, I've got many, MANY operations to do, and adding a few million subroutine calls slows things down significantly (but not insufferably)."

Did you measure it?
Re^2: Does one encapsulate a class from itself? by amarquis (Curate) on Mar 06, 2008 at 19:39 UTC
Yes. I do not remember the exact results, but I expected a noticeable difference since each get/set does data validation (since my accessors are part of the external interface, they have to make sure everything coming in is kosher. The module itself already knows it has "good" data and can just push it into the array, or update a cell). I could fix this by having internal-only accessors, like ikegami mentions. I'll try it and see if the difference is measurable.
Re: Does one encapsulate a class from itself? by guaguanco (Acolyte) on Mar 07, 2008 at 00:22 UTC
There's a big difference between a public API (which may do lots of argument checking and other validation) and an internal API. If your class is complex, it's perfectly reasonable to provide some internal API for changing the state of the object. This allows you to change your underlying implementation without breaking as much dependent code. But (as you have observed) the public API may be too cumbersome or slow for the internal implementation to use. So it All Depends: there's no single right answer.
Re: Does one encapsulate a class from itself? by dynamo (Chaplain) on Mar 07, 2008 at 00:20 UTC
You can (probably) have it both ways.

1 - Write and make available your getters/setters, as usual.
2 - Identify places where it's going to have an actual performance cost to use them in internal methods, and write _extra_ getters / setters / modifiers that can be called from internal methods with only one function call of overhead. These other methods can do batch processing on whatever data you have; just pass them all relevant info to do that process (if it's not already embedded in $self). Be careful about letting your getters return references, unless you really mean to do it. Return copies to preserve opaqueness.
3 - Consider making your new methods part of the public API.

- d
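A rough sketch of points 2 and 3, under assumptions: the Grid class, its row getter, and the scale_all batch modifier are hypothetical names. The batch method pays for one subroutine call and then loops over the raw structure, while the getter returns a copy so callers cannot reach into the object.

```perl
package Grid;
use strict;
use warnings;

# Copy the incoming rows so the object owns its data.
sub new {
    my ($class, @rows) = @_;
    return bless { dataref => [ map { [@$_] } @rows ] }, $class;
}

# Public getter: returns a copy of the row, not the internal reference.
sub row {
    my ($self, $x) = @_;
    return [ @{ $self->{dataref}[$x] } ];
}

# Batch modifier: one call, then a tight loop over the raw structure.
sub scale_all {
    my ($self, $factor) = @_;
    for my $row (@{ $self->{dataref} }) {
        $_ *= $factor for @$row;    # $_ aliases each cell, so this updates in place
    }
    return $self;
}

package main;

my $grid = Grid->new([1, 2], [3, 4]);
$grid->scale_all(10);

my $copy = $grid->row(1);
$copy->[0] = 'changed';                       # only the copy is affected
print join(',', @{ $grid->row(1) }), "\n";    # 30,40
```

Copying on every read has its own cost, so whether row should copy is the same kind of judgment call as the rest of this thread.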
Re: Does one encapsulate a class from itself? by toma (Vicar) on Mar 07, 2008 at 08:14 UTC
The important thing is to get used to building object-oriented modules, and make it part of your normal coding style. Build them in any way that seems easy and natural. If you need performance, optimize.

I tried several alternatives, and then settled on a blessed hash for most of my objects. I don't use the getter/setter functions internally, except when I think the object data access needs a more future-proof implementation.

For your matrix example, I would be tempted to access the data even more directly, bypassing the object reference altogether.

`{ my $data = $self->{dataref}; foreach my $x (0..$max_x) { $data->[$x][$y] = $new_value; } }`

I use the smallest scope possible for the lexical variables that refer to chunks of my object. These lexical variables have descriptive names, and for me the code is easier to read. This approach reduces the length and complexity of my mathematical and string expressions, especially for setter functions. I haven't seen much of a performance difference when I use the lexicals to refer to a sub-object, but I have sometimes had speed problems with getter and setter function calls.

There are good arguments for both ways, though. Since it depends on what you are doing, do whatever seems best, and it will be fine. If it isn't, change it!

It should work perfectly the first time! - toma
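Filled out into something runnable, the lexical-alias idea might look like the sketch below. The Grid class, the constructor, and the fill_column method name are assumptions, and the matrix is taken to be the 2-d array (array of arrays) described in the question.

```perl
package Grid;
use strict;
use warnings;

sub new {
    my ($class, $rows, $cols) = @_;
    my @data = map { [ (0) x $cols ] } 1 .. $rows;
    return bless { dataref => \@data }, $class;
}

# Set every cell in column $y, going through a short-lived lexical.
sub fill_column {
    my ($self, $y, $new_value) = @_;
    {
        my $data  = $self->{dataref};    # same array, not a copy
        my $max_x = $#{$data};           # index of the last row
        foreach my $x (0 .. $max_x) {
            $data->[$x][$y] = $new_value;
        }
    }
    return $self;
}

package main;

my $grid = Grid->new(3, 2);
$grid->fill_column(1, 7);
print join(' ', map { join(',', @$_) } @{ $grid->{dataref} }), "\n";    # 0,7 0,7 0,7
```

The bare block keeps $data and $max_x from leaking into the rest of the method, which is the "smallest scope possible" point made above.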

