in reply to Psychic Disconnect and Object Systems

Invariably, encouraged by many OO languages & frameworks and almost all OO teaching texts, people start out to define their objects by defining their attributes. This is, in my opinion--based on my own experience of using OO and of working with other people's OO code--completely the wrong way to approach the problem.

When you write procedural code, you don't start by writing a huge block of variable declarations at the top of the program, subroutine or module. Not even in those languages that require declarations to come at the top of the block. You start by writing the algorithm, and then go back and declare your variables as you need them.

The right way--I know, I know; bear with me--is to define the methods first. Ask yourself not: what do these objects contain? But rather: what do these objects need to do?

The second mistake, which even very experienced OO programmers and architects seem to make repeatedly, is to subtly alter that second question and instead ask: what could these objects do? And that leads to all sorts of O'Woe.

The right way--again--is to write the code that will use the objects first. And I'm not talking about cutesy ok/nok unit tests either. I mean the actual code. Just create an instance of your object by calling the class's new() (or whatever constructor name makes sense) method with no parameters, and then do what you need to do with it in the context of the calling code.

Out of this fall the methods (and sensible names) you need to call, along with many of the arguments those methods will require. In the writing of the calling code, the interface will change, change again, maybe change back again. Some methods will get broken up into two (or more). Others may be combined. You (I, anyway) often find that things (values, constants etc.) that I initially thought belonged inside the class instance are actually application-specific, not class/object-specific, and so don't belong in the instance. And, much more rarely, vice versa.
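
A minimal sketch of what that first draft might look like--all the names here (Parser, parse_line(), count()) are invented for illustration, and the stub package at the top exists only so the draft compiles clean:

    use strict;
    use warnings;

    ## Stub class, fleshed out only after the calling code below has settled.
    {
        package Parser;

        sub new        { my $class = shift; return bless { _lines => [] }, $class }
        sub parse_line { my( $self, $line ) = @_; push @{ $self->{_lines} }, $line; return }
        sub count      { my $self = shift; return scalar @{ $self->{_lines} } }
    }

    ## The calling code, written first; the method names above fell out of writing this.
    my $parser = Parser->new();

    for my $line ( "alpha\n", "beta\n", "gamma\n" ) {
        $parser->parse_line( $line );
    }

    print "Parsed ", $parser->count, " lines\n";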

Once I'm reasonably happy with the calling code for the class, I can then move on to writing the methods that it uses, knowing not just what they should do, but how they will be called. Which makes writing them much easier. And when writing the methods, it becomes clear what attributes are required. And it becomes much easier to see which attributes need to be stored in the instance data, and which can be generated on the fly. And you should never store an attribute if it can be reasonably regenerated from other attributes.
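
A trivial sketch of that last point (Rect and its fields are names I've made up): the derived value is regenerated from the stored attributes on demand, rather than stored as a third attribute that could drift out of step with them:

    use strict;
    use warnings;

    {
        package Rect;

        sub new {
            my( $class, $width, $height ) = @_;
            return bless { _width => $width, _height => $height }, $class;
        }

        ## Derived on the fly; never stored, so it can never go stale.
        sub area { my $self = shift; return $self->{_width} * $self->{_height} }
    }

    my $r = Rect->new( 3, 4 );
    print $r->area, "\n";    ## 12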

Working top down this way means that I can concentrate on writing the code that is actually needed, rather than trying to second-guess what might be needed.

Only once the first draft compiles clean--and preferably can be exercised, though that is often not possible without expending huge effort trying to mock shit up around it--do I then look at the interface with a 'what if another application might want to reuse this class some day?' eye, to see if there is anything that can obviously be made more general without compromising its usability/maintainability/performance for this application too much.

An interesting side-effect of this is that you rarely end up with externally visible accessors to your attributes--which is a mighty good thing. And if you apply good logic to it, using internal accessors for attributes that have no externally visible interface makes no sense at all. Which makes auto-generation of accessors a complete waste of time.

In summary (IMO throughout):

Sitting down to write a new class definition before you've written--and therefore understand, because you do not before--how it will be used is fundamentally flawed. It just means you are second-guessing the real world, and that leads to whatifitis.

And trying to decide what attributes are needed by an object, before you have a clear idea of both the class interface (methods) and their implementation requirements & costs, is insane.

But almost none of the OO texts I've seen teach people to work that way, which may or may not give you a clue as to how much weight you should give to my opinion :)

BTW: Read-only attributes are otherwise known as constants--and are better defined as such. Except for the rarely implemented concept of externally read-only attributes which are internally read-write. That is, an attribute that is modified internally as the object evolves, but that can be read directly from external sources.

But does anyone use them? Does any OO framework support them? Mostly a getter (no setter) is defined to prevent direct access; and most times it is better to simply (re)calculate the value on demand rather than storing it.
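
For what it's worth, a hand-rolled sketch of that rarely implemented concept (Counter and its methods are invented for illustration): the attribute is written only from inside the class as the object evolves, and the outside world gets a getter and nothing else:

    use strict;
    use warnings;

    {
        package Counter;

        sub new { my $class = shift; return bless { _count => 0 }, $class }

        ## Internally read-write: only the object's own methods modify it.
        sub record_event { my $self = shift; $self->{_count}++; return }

        ## Externally read-only: a getter, and deliberately no setter.
        sub count { my $self = shift; return $self->{_count} }
    }

    my $c = Counter->new;
    $c->record_event for 1 .. 3;
    print $c->count, "\n";    ## 3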


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Psychic Disconnect and Object Systems
by ikegami (Patriarch) on Apr 15, 2011 at 21:02 UTC

    Read-only attributes are otherwise known as constants--and are better defined as such.

    Attributes are per-object. Constants are per-process.

      Attributes are per-object.

      If an instance attribute is to ever have any value, then it is at best: write-once.

      And it is conceivable that a factory constructor could derive from different base classes to provide different constant attributes.

      Constants are per-process.

      Actually, Perl's constants are per-package rather than per-process. And it certainly isn't inconceivable that they could be implemented to be block-scoped, as many of the newer pragmas are.
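
      A quick sketch with the core constant pragma (package names invented), showing that each constant belongs to the package that declares it:

      use strict;
      use warnings;

      {
          package Circle;
          use constant PI => 3.14159265358979;    ## defined in, and owned by, Circle
      }

      {
          package Physics;
          use constant PI => 3.14159;             ## a different PI, in a different package
      }

      print Circle::PI, "\n";
      print Physics::PI, "\n";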

      And as I understand it, it would be completely possible to have per-instance constants with Perl 6 using Roles. Maybe even in Perl 5 with Perl6::Roles.



        If an instance attribute is to ever have any value, then it is at best: write-once.

        I've never heard people say constants are write-once rather than read-only, yet the same applies to them.

        Yes, I did say "read-only" attributes are «more "write-once" than "read-only"» earlier, but I now think that was a mistake.

        Actually, Perl's constants are per-package rather than per-process.

        For each definition of an attribute, there is one value per object.

        For each definition of a constant, there is one value per interpreter.

        I have no idea what you think is per package.

Re^2: Psychic Disconnect and Object Systems
by Anonymous Monk on Apr 15, 2011 at 20:50 UTC

    A real world example of a read-only variable which can't be recalculated on demand is your car's odometer.

      A real world example of a read-only variable which can't be recalculated on demand is your car's odometer.

      Ignoring that in the real world, the DVLA might argue with that assessment. Clocking is still big business, though it is getting harder to do. :)

      That is an example of both "Except for the rarely implemented concept of externally read-only attributes which are internally read-write." and an exception to "most times".

      But even that belies the reality of (at least modern) real-world odometers, which can display in either miles or kilometres but internally probably count output-shaft revolutions. They might therefore be best modelled as an internal revolutions-counter attribute with a pair of methods that apply an appropriate scaling factor to that attribute to produce the miles or kilometres figure.

      Why not just count in miles and convert to kilometres on demand, you might say. But back in the real world, the same electronics in the dash are used in vehicles with multiple different drive trains. E.g. the BMW 3 Series comes with 1.6, 1.8, 2.0, 2.5, 3.0 & 3.5 litre engines. You also have low-revving diesel and high-revving petrol engine options. Whilst the gear ratios will usually account for much of the difference in engine speed to road speed, they do not fit a rear axle capable of withstanding the rigours of 300bhp to the 120bhp 1.6 models, and the final drive ratios will be different. Adding to that, different models have different-sized rims and tyre profiles, so the odometer also has to account for different rolling radii.

      The upshot is that they do not make a different odometer for each of these combinations of model to allow them to count directly in miles or kilometres. Instead, they count some arbitrary number of revolutions of the input, and then write different constants ("appropriate scaling factors") to ROM which are used to convert the counter to real-world measurements.

      In other words, there are no getters or setters for the actual attribute; just methods to obtain calculated derivatives of that private attribute.
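
      Something like this sketch, where the class, the calibration constant and the method names are all invented for illustration:

      use strict;
      use warnings;

      {
          package Odometer;

          ## Hypothetical per-model calibration: output-shaft revolutions per mile,
          ## the sort of constant that would be written to ROM per drive train.
          sub new {
              my( $class, $revsPerMile ) = @_;
              return bless { _revs => 0, _revsPerMile => $revsPerMile }, $class;
          }

          ## The only thing that ever changes the private counter.
          sub tick { my( $self, $revs ) = @_; $self->{_revs} += $revs; return }

          ## No getter/setter for _revs itself; only derived readings.
          sub miles      { my $self = shift; return $self->{_revs} / $self->{_revsPerMile} }
          sub kilometres { my $self = shift; return $self->miles * 1.609344 }
      }

      my $odo = Odometer->new( 1000 );    ## made-up calibration constant
      $odo->tick( 250_000 );
      printf "%.1f miles / %.1f km\n", $odo->miles, $odo->kilometres;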


      Hum? My odometer changes all the time. You wouldn't use is => 'ro' for a car's mileage because you wouldn't be able to change it as the car moves.
        Actually, you would. You do not want car odometers with a knob that allows the driver to change the value. The external API of a car odometer does not have a setter - the driver can look at the display and see the value; he cannot set it. (Day trip meters are an intermediate case - they allow display, and they have a reset function, but still, the driver cannot set it to an arbitrary value.) Now, the driver can drive the car, and that will cause the attribute to change, but that's not setting the value of an attribute using a set-accessor.
      I think a better example is the "actual time". The actual time is a constantly changing value which is read-only. Except for those people who are allowed to insert leap-seconds.
        I think a better example is the "actual time". The actual time is a constantly changing value which is read-only.

        That is no better example.

        Application code, whether Perl or C/C++ or any other language, does not have direct access to the hardware timers that underlie real-time clocks.

        Therefore, it makes no sense for an application class or object to have an "actual time" attribute. They may have an actual time method, but that will of necessity make a system call to obtain the value.

        Even if the class we were discussing were a kernel system class, an "actual time" method would still not return a direct read of a read-only attribute. Clocks are volatile hardware counters, and they do not count directly in program-usable real-time units. They count in something related to bus clocks, which vary from processor to processor, and even with the current energy-saving or turbo-boost states.

        So, any "actual time" method is going to involve some conversion from hardware clock units to real-worlds units. In other words, it is going to be a derived value.


Re^2: Psychic Disconnect and Object Systems
by Jenda (Abbot) on Apr 16, 2011 at 23:17 UTC

    The thing is, this is the right way only for one sort of object ... for objects that are (mostly) behaviour containers. Objects that exist to do something and happen to need some (mostly internal) state to provide you with a nice interface.

    But there is a totally different sort of object ... objects that are (mostly) data containers. Objects that exist to store some (mostly externally accessible) data and provide little behaviour apart from validating the values stored and ensuring the consistency of the data set.

    In this case, starting with the methods would be pointless. Instead you need to know what data you need to store and what the types and constraints are, and then you may start adding methods that compute something out of the stored data, that change several attributes at once in some way, ...

    Most times it's easy to say which sort of object you are designing ... a "worker" or an "intelligent data structure"; sometimes the distinction is not so clear. Anyway, starting with the code that uses the object is IMHO a good idea.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

      But there is a totally different sort of object ... objects that are (mostly) data containers.

      There is a call for this type of object, but I think they are (or rather, should be) far rarer than they tend to be in many code bases. I think this comes about because people, and the texts/courses they learn from, tend to start their design process from the bottom and work up.

      Data has no purpose in isolation from the code that manipulates it. That is, at some point in the application, there is code that instantiates and populates one or more of these "data container" objects and then operates upon the data encapsulated. That code is (should be) the methods for that data.

      But that code often doesn't lend itself--because of other data dependencies or code structural requirements--to being pushed down into the data object class as methods, so the object becomes data only. Devoid of methods beyond constructors, destructors and accessors. And the result is that you've created a class to hold the data and access it, and nothing more. But if you lifted the data within the container to the level where it is actually manipulated, then you'd save an entire layer of pure overhead.

      That is to say, at level X of the application, instead of storing a reference to an object--a blessed hash in Perl terms; a 'blessed' struct in C++ terms--as a my/auto variable and accessing its contents through methods, you store the hash or struct in a my/auto variable at that level and access its contents directly. And you save the overhead of method lookup and subroutine calls, and the inconvenience of method-call syntax to access the data.

      I.e. instead of:

      {   ## some block at some level
          my $obj = Class->new( $initX, $initY );
          $obj->setZ( $obj->getX() + $obj->getY() );
      }

      You have:

      {   ## some block at some level
          my %hash = ( X => $initX, Y => $initY, Z => 0 );
          $hash{Z} = $hash{X} + $hash{Y};
      }

      Now someone will say: but what if you have two (or more) places in your code where you need this object? By inlining it you are promoting C&P reuse. And my response is that if these are the same (type of) object, then the manipulations (above represented by: $hash{Z} = $hash{X} + $hash{Y}) will be the same, and so that code should be a method. The response will be that maybe in the other place, they need to be multiplied, not added. At which point I say that they are not the same (type of) object, despite the fact that they both have X, Y & Z components. And so, they should not be merged into a single class.

      Now that example is all far too abstract and trivial to make for a good case for my position. But it does demonstrate the point that if you design from the bottom up, attributes first, you will often conflate distinct objects because they are superficially similar in their internal data structure. They both have X, Y & Z components. But if you operate top down--then you probably wouldn't encapsulate such trivial data in the first place :)--but if you did, then the object in one place would have an Add() method, and the other a Multiply() method. And you would not therefore conflate the two types.
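
      Sticking with the same toy terms--Sum, Product, add() and multiply() are invented names--that would look something like:

      use strict;
      use warnings;

      ## Two superficially similar classes: the same X, Y & Z components,
      ## but different behaviour--so they stay as two distinct types.
      {
          package Sum;
          sub new { my( $class, $x, $y ) = @_; return bless { X => $x, Y => $y, Z => 0 }, $class }
          sub add { my $self = shift; $self->{Z} = $self->{X} + $self->{Y}; return $self->{Z} }
      }

      {
          package Product;
          sub new      { my( $class, $x, $y ) = @_; return bless { X => $x, Y => $y, Z => 0 }, $class }
          sub multiply { my $self = shift; $self->{Z} = $self->{X} * $self->{Y}; return $self->{Z} }
      }

      print Sum->new( 2, 3 )->add, "\n";             ## 5
      print Product->new( 2, 3 )->multiply, "\n";    ## 6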

      Sure, you can probably get away with creating a hybrid class that has both methods. But then you have this complicated class that does more than either usage requires, and so you won't spot that they are really two different classes, each of which is only used in one place. And you definitely won't spot that they are each so simple that they can each be lifted into their call site, thereby making the code both simpler and more efficient.

      Again, I'll say that this example is too trivial to make for a convincing argument. So, I'd ask you to outline one of these "data only objects" and its use. Preferably a real-world example you have kicking around. Then I could attempt to make my case more strongly, and give you the opportunity to counter based upon real-world usage rather than trivial abstract examples.

      One of us might convince the other :)


      Indeed, in C++ I often augment a "plain" struct with member functions. I wouldn't write a virtual get/set pair for each field though.
Re^2: Psychic Disconnect and Object Systems
by John M. Dlugosz (Monsignor) on Apr 18, 2011 at 01:56 UTC
    This is, in my opinion--based on my own experience of using OO and of working with other people's OO code--completely the wrong way to approach the problem.
    I could not agree more. I've seen discussions on automatically generating reader and writer methods for state variables (in which languages, I don't recall), and I join the chorus of those pointing out that this probably isn't a good thing to be doing. It just encourages making the state not encapsulated, and is no different than just declaring all the fields public.

    The right way--I know, I know; bear with me--is to define the methods first. Ask yourself not: what do these objects contain? But rather: what do these objects need to do? ... The right way--again--is to write the code that will use the objects first.
    For sure! I start with use-cases and from there describe the object user's view of the object. I've written full documentation before finalizing the underlying design of just how it manages to do all that.

    It sounds like we are on the same page. So, how would you use Moose today? Certainly I can start by writing the methods, and plan that any internal state that I need to implement them is indeed an internal implementation detail. So how do you declare your has's to create per-instance storage, for internal use only?

      So, how would you use Moose today?

      I probably wouldn't. Mouse maybe, but not Moose. Mostly because it carries a lot of complexity under the covers in order to provide features--like deep introspection--that I do not see the benefit of.

      And even though that complexity is hidden from me, I know it is still there, and I must carry its overhead even though I'll never benefit from it. And for the type of code I mostly write these days, that overhead is significant.

      So how do you declare your has's to create per-instance storage, for internal use only?

      As I understand it, if you use is => 'bare', then you get no accessors and no warnings.
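
      A sketch of that, assuming Moose is installed (Tally and _count are my own invented names): is => 'bare' declares the per-instance slot without generating accessors or warnings, leaving it reachable only from the class's own methods via the underlying hash:

      use strict;
      use warnings;

      {
          package Tally;
          use Moose;

          ## Per-instance storage: no reader, no writer, no warning.
          has '_count' => ( is => 'bare', default => 0 );

          sub bump  { my $self = shift; $self->{_count}++; return }
          sub total { my $self = shift; return $self->{_count} }

          __PACKAGE__->meta->make_immutable;
      }

      my $t = Tally->new;
      $t->bump for 1 .. 5;
      print $t->total, "\n";    ## 5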

      I'm not into the bondage and discipline thing. Indeed, in part it was the complexity of the whole private/public/friend thing that caused me to arrive at a considerable distaste for C++. (There are other reasons also, but that was one.) So when I feel the need for OO in Perl, I'm perfectly happy to construct mine around blessed refs manually.

      With no attribute accessors at all--public or private. A leading '_' to indicate private methods is sufficient for my purposes. And initialisation is done by the constructor from whatever form of input makes most sense. E.g. if the data that is used to initialise an object is read from an external source as text, then either that text gets passed to the constructor, or, for mass instantiation, an open file handle.
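
      A sketch of that hand-rolled style--Record, its fields and the input format are all invented for illustration--with a constructor that parses a line of text, a '_'-prefixed private method, and no accessors anywhere:

      use strict;
      use warnings;

      {
          package Record;

          ## Constructor initialises directly from a line of text; no accessors at all.
          sub new {
              my( $class, $text ) = @_;
              my( $name, $value ) = split /\s*,\s*/, $text;
              return bless { name => $name, value => $value }, $class;
          }

          ## '_'-prefixed: private by convention, used only from within the class.
          sub _scaled { my $self = shift; return $self->{value} * 100 }

          sub report {
              my $self = shift;
              return sprintf "%s => %d", $self->{name}, $self->_scaled;
          }
      }

      my $rec = Record->new( "widgets, 42" );
      print $rec->report, "\n";    ## widgets => 4200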

      Declarative syntax is a nice-to-have, but only if it retains sufficient control to allow me to decide on (and only pay for) those features I want/need.

      As I rarely write accessors for my attributes, I've never felt the burden of doing so, so I don't need auto-generation. And as I only validate parameters when they cross the public/private boundary--not every time an attribute gets written--the use of declarative validation is also superfluous.

