mascip has asked for the wisdom of the Perl Monks concerning the following question:

Edit: forget about this thread; I mostly just got very confused and lost in vocabulary, and ended up answering myself as I read and discovered new concepts.

Hello all,
I'm simultaneously reading a book about TDD, "Growing Object-Oriented Software, Guided by Tests", and writing a small program, and I have a question about design and TDD methodology.
My aim here is mainly to learn TDD; I'm not in a rush, so I can try different designs and learn.

I am trying to respect the "single responsibility principle". As the book says, it can sometimes lead to more lines of code and more complicated structures, but the result is more maintainable: you easily find what you are looking for (separation of concerns), and you never need to cut through unrelated functionality. That makes for more modular, adaptable programs.
You end up with more classes, which you can combine easily to create more powerful higher-level objects. The power is not purely in the objects anymore, but in their relationships: how they are assembled together.

~ ~ ~
My program must process hundreds of spreadsheets, each containing measurements from several experiments, and do complex manipulations and calculations on these data, experiment by experiment.

I started this project by writing an end-to-end test for a simple feature: reading 3 lines from two of the spreadsheets and doing a very simple calculation on them, with a calc_simple_stuff_for_experiment($experiment_directory) subroutine.

After a few minutes the program compiles, and my next task is to implement calc_simple_stuff_for_experiment(). There are three tasks in this subroutine (a rough sketch follows the list):
1. read the two spreadsheets (into two Data::Measurement objects, for example). I will create a Reader::OfMy::MeasurementSheets object for this.
2. gather their data into a common structure: as the measurements are not taken at exactly the same times, I need to "merge" them onto a common time axis (into a Data::Experiment object, for example).
3. do my calculations for each time period (this is the easy bit).
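
Here is roughly how I imagine those three steps hanging together inside the subroutine. Reader::OfMy::MeasurementSheets and Data::Experiment are the classes imagined above; the method names read_all_in and calc_simple_stuff are just placeholders, none of this exists yet:

sub calc_simple_stuff_for_experiment {
    my ($experiment_directory) = @_;

    # 1. read the two spreadsheets into Data::Measurement objects
    my $reader       = Reader::OfMy::MeasurementSheets->new;
    my @measurements = $reader->read_all_in($experiment_directory);

    # 2. merge the measurements onto a common time axis
    my $experiment = Data::Experiment->new( measurements => \@measurements );

    # 3. do the simple calculation for each time period
    return $experiment->calc_simple_stuff;
}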

~ ~ ~
I imagined two ways to implement this simple feature, with strong differences in design. They almost seem "opposite", or "symmetrical". I would like your opinions on these two designs.

* Idea 1 (my first, intuitive idea):
I create a Data::Experiment object. This object has two Data::Measurement attributes. In the constructor (or in a subroutine, just to keep the constructor simple), it "time-merges" the information from the two Data::Measurement objects into a common structure.
Then I can just do my calculations from this Data::Experiment object.
Or I could even create a Calculator::ForMyExperiment object, which would take the Data::Experiment object as an attribute and do the calculations on it.
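
A minimal Moose sketch of this first idea, taking the "merge in a subroutine" variant and doing it lazily (the attribute and method names are only illustrative):

package Data::Experiment;
use Moose;

# the experiment "has" its measurements
has measurements => (
    is       => 'ro',
    isa      => 'ArrayRef[Data::Measurement]',
    required => 1,
);

# the data merged onto a common time axis, built on first access
has merged_data => (
    is      => 'ro',
    lazy    => 1,
    builder => '_merge_on_time_axis',
);

sub _merge_on_time_axis {
    my ($self) = @_;
    # ... align the measurements on a common time axis ...
}

sub calc_simple_stuff {
    my ($self) = @_;
    # ... the simple calculation, over $self->merged_data ...
}

1;

The Calculator::ForMyExperiment variant would simply move calc_simple_stuff into its own class, with the Data::Experiment object as an attribute.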

* Idea 2 ("from the book"):
If I try to follow what they do in the book (and I might have misunderstood it), then I start from the input, by creating the Data::Measurement class that will contain the data read from the spreadsheets (and even the Reader class before that).
Then I need to delegate the responsibility of merging the data on the time axis, and to do this I follow the method used throughout the book (have I misunderstood? It was never clearly stated, but it seemed to me that they use this same methodology every time the words "delegating" and "responsibility" appear):
- I create a Role to represent the responsibility: Merges::Measurement::Data::On::TimeAxis.
- Then I create a class to consume this Role: Data::Experiment.
- And I "plug" this new class into Data::Measurement, by giving a Data::Experiment attribute to the Data::Measurement objects and passing it all the information it needs to fulfill its task: a reference to the data hashes.

Which means I would have to create a Data::Experiment object in my script and give a reference to it, as an attribute, to the two Data::Measurement objects.

Then the same thing for the Calculate::Simple::Stuff role: it would be consumed by the Calculator::ForMyExperiment class, which would be "plugged" as an attribute into the Data::Experiment class.

So in this second design the relationships are reversed: Data::Measurement has a Data::Experiment attribute, which has a Calculator::ForMyExperiment attribute (sketched below). Exactly the opposite, or mirror image, of the first design.
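
A hedged sketch of that reversed wiring with Moose roles (the class and role names are the ones imagined above; the data and publish_data names are made up for the illustration):

package Merges::Measurement::Data::On::TimeAxis;
use Moose::Role;

# the responsibility itself: merge data hashes onto a common time axis
sub merge_on_time_axis {
    my ($self, $data_ref) = @_;
    # ... fold $data_ref into whatever structure the consumer keeps ...
}

package Data::Experiment;
use Moose;
with 'Merges::Measurement::Data::On::TimeAxis';

# the next link in the chain: the experiment "has" its calculator
has calculator => ( is => 'ro', isa => 'Calculator::ForMyExperiment' );

package Data::Measurement;
use Moose;

has data => ( is => 'ro', isa => 'HashRef' );

# the "reversed" relationship: each measurement holds the experiment it feeds
has experiment => ( is => 'ro', isa => 'Data::Experiment' );

sub publish_data {
    my ($self) = @_;
    $self->experiment->merge_on_time_axis( $self->data );
}

1;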

Right now, the first design still feels "healthier" to me than the second one, simply because, intuitively, an Experiment "has" Measurements rather than the other way around, and a Calculator "has" Data rather than the other way around. That is how I was taught to think about attributes.

~ ~ ~
Writing these lines, I imagined a very naive explanation: maybe
in their case (the example from the book), they delegate the responsibility to an attribute because the information that is transferred consists only of orders: "do this, do that".
In my case, I need to do the opposite (delegate the responsibility to the "parent") because what I am transferring is data. The parent already knows what it must do; I am not giving it any order.

So... maybe I am seeing this problem from the wrong angle. Maybe what they call "input" in this book (when they say "develop from the inputs to the outputs") is an input of orders, not an input of data.
But this explanation doesn't feel right either. In the book (p. 113, for example), they start "from the data": they begin by creating an "auction message translator", which then "delegates" the responsibility of responding to the message by holding the next class as an attribute. So they end up with an "auction message translator" which "has" an "auction sniper". That seems to be the exact opposite of what I was taught at school.
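
To make that relationship concrete, here is a very loose Moose rendering of it (this is not the book's actual code, and the method names are made up):

package AuctionMessageTranslator;
use Moose;

# the translator "has" the sniper it delegates to
has sniper => ( is => 'ro', required => 1 );

sub process_message {
    my ($self, $raw_message) = @_;
    my $event = $self->_translate($raw_message);   # turn the raw auction message into an event
    $self->sniper->handle_event($event);           # delegate to the object it "has"
}

sub _translate {
    my ($self, $raw_message) = @_;
    # ... parse the raw message into some event structure ...
}

1;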

I am confused, confusled, discombobulated. What do you think? Is this second design good too? Better than the first one? Are they both valid, simply deriving from different ways of viewing responsibility delegation?

~ ~ ~ * EDIT : *

PS: for those who haven't read the book, I found an example here (http://www.methodsandtools.com/archive/archive.php?id=90) (in Java, but it's fairly simple) that illustrates the book's method for delegating a responsibility:

" The concept of SingularTask represents a singular focused activity that is delegated to a Worker. If a SingularTask is started, it will ask someone or something to work on the WorkItem. Do we need a Person for this? Possibly, but let's focus on the role instead and call it Worker. The corresponding test is:

@Test
public void startShouldMakeWorkerWorkOnWorkItem() {
    Worker worker = mock(Worker.class);
    WorkItem workItem = mock(WorkItem.class);
    SingularTask task = new SingularTask(worker);
    task.start(workItem);
    verify(worker).workOn(workItem);
}
We specified that the worker should receive the message workOn(workItem) when the task is started. The corresponding implementation is:
public class SingularTask implements Task {
    private Worker worker;
    ...
    public void start(WorkItem workItem) {
        worker.workOn(workItem);
    }
    ...
}
"

So, as I said before, with this methodology, for the Task to delegate its work I create a Worker role (I would have called it Process::Work or Do::Work), which will do the work (on the item), and I put it as an attribute on the Task.
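
For comparison, the same shape in Perl/Moose, with a test in the same spirit as the Java one (just a sketch; here the Worker is mocked with Test::MockObject, which is only one of several ways to do it):

package SingularTask;
use Moose;

# the task delegates the actual work to whatever plays the Worker role
has worker => ( is => 'ro', required => 1 );

sub start {
    my ($self, $work_item) = @_;
    $self->worker->work_on($work_item);
}

package main;
use Test::More tests => 1;
use Test::MockObject;

my $worker = Test::MockObject->new;
$worker->mock( work_on => sub { } );

my $task = SingularTask->new( worker => $worker );
$task->start('some work item');

ok( $worker->called('work_on'), 'start() makes the worker work on the work item' );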
I will read the whole article tomorrow, when it's not the middle of the night.

~ ~ ~ * EDIT bis (PPS) : *

Finally I read a bit more. If I understood correctly, both designs are "valid".
The first corresponds to a "bottom-up" build, where you first build the core (the low-level components) of your application. The second corresponds to a "top-down" approach, where you first consider the "high-level" classes that deal with "the real thing", and only later build the core of the application.

There seem to be lots of discussions out there about which method is better. Maybe different methodologies suit different people. I'll try a bit of "top-down" for a while, to get a feel for it. I think that when I code "bottom-up", with my limited experience, I tend to code far too many things: features that I don't need later. "Top-down" might help me become more "realistic".

Am I getting this right?

And sorry for a very long message. Writing it helped me understand my questions, and hopefully there will be some interesting feedback and/or discussion.

Re: Test Driven Development example + design question
by moritz (Cardinal) on Jul 01, 2012 at 10:28 UTC

    I haven't read the book that you read, so my advice might differ a bit from what the book says.

    As another disclaimer, Perl is much more expressive than Java, so we can make some APIs simpler, and still have flexibility under the hood.

    So, what do you know about the problem? It involves reading some files, doing computations, and writing files. What's the simplest possible API for reading files?

    my $data = read_file($filename);
    # or maybe the reader needs to know the file format too?
    my $data = read_file($filename, $fileformat);

    Maybe you'll object "but that's not object oriented!". So what? Your job is to get stuff done. You can still do some OO stuff behind the scenes if you really want, but for now you should focus on making the APIs as simple as possible. Oh, and you probably return some kind of object from read_file, so there's your OO part.

    Anyway, now that you have a simple API for a part of your program, you can start to write tests.

    use 5.010;
    use Test::More tests => 2;
    use YourModule qw/read_file/;

    my $data = read_file('t/data/ExampleData');

    # now you have to ask yourself, what kind of data
    # will you need for processing the spreadsheets?
    # I know nothing about your application, so
    # I'm just making up some stuff here:
    is join(',', $data->headings), 'time,pressure,temperature',
        'can get column headings from $data';
    is $data->temperature( time => "0.01" ), 273.018,
        'can retrieve temperature by time';

    So, now you need to implement it far enough to get the tests passing.

    Next you can do the same for processing the data, and for writing a file. The API of each of those can always be a single subroutine call.

    If you need some more flexibility later on, those subroutines can call some constructors on your classes and use polymorphism and whatever buzzword you want to be compatible with. But always remember that the user should see as little complexity as possible. That way the tests also see very little complexity, and it becomes very easy to test each piece of functionality in isolation from the others.

    Once you have all of this in place, you will probably want some higher level abstraction, for example something that reads several input files, passes them to a routine that does the processing, and then writes output files. Again the API for this can be a simple function, and thus it can be easy to call and test.

    read_process_write(
        inputfiles => [ 't/data/TestInput1', 't/data/TestInput2' ],
        merger     => sub { ... },
        outputfile => 't/data/temp/TestResult',
    );

    Again, write the tests for it and implement it. Then be happy.

    Note that all this time I didn't talk about how to structure roles and classes. That's because it's not really central to the working of your program, and it is often obvious. Once you know that an API is my $data = read_file($path);, it is obvious that $data must contain an object that knows everything interesting about the data in file $path (or contains other objects that know it).

      Thank you both for your answers.

      You are both saying: "your program is simple, just do things one after the other, you probably don't need objects" (and giving me examples of how you use TDD, thank you).

      Practically, I agree, BUT
      my question is more theoretical than practical.
      I am trying to understand key concepts on a (too) simple example. And I think, in a way, I finally understood that my question is really "bottom-up or top-down design?". Could anybody confirm this? This is all new vocabulary to me, so I would appreciate it if someone experienced told me whether I am understanding the concepts properly or not.

      When I build a bigger, more complex program, I will have the same question again: should I start by building low-level objects (a bottom-up approach), thus basing my design on assumptions about external inputs? Or should I go "outside-in", first interacting with my external peers and then delegating responsibilities to new, lower-level classes as I see that I need them?

      The programmer who wrote this article prefers a top-down approach, because he doesn't like to base a design on assumptions and thus write useless code.
      While this one prefers bottom-up, because it (supposedly) results in more loosely-coupled elements of reusable code, and it requires far fewer stubs or mocks during development.

      These questions are important to me. Sure, I don't want to become a "purist" and make my life complicated on simple problems all the time. But I am trying to learn how to design highly maintainable, loosely-coupled code, and if there are several ways to do it, I'd rather learn about them.

      ~ ~ ~
      I can't find it today, but yesterday I read a blog entry by chromatic saying that he likes to use the smallest possible objects and roles. And I have read the same in a few other places. I guess it sounds like "more work", and people will say "why are you being so complicated?", but it also gives maintainability and flexibility. And I don't think that "more classes = more work", as creating classes is so simple with Moose.

      PS: thank you for pointing me at Test::XT.

        I found the blog entry. chromatic wrote this as a comment :
        "I wish Perl 5 had classes with less inertia. It wouldn't have to go as far as Smalltalk, but if I felt classes were lighter and more agile, I'd use a lot more little classes.
        Am I the only one?"

      I read a bit more.
      Apparently I misunderstood the vocabulary:
      "Coding by Intention you write your code top-down instead of bottom up. Instead of thinking, "I'm going to need this class with these methods," you just write the code that you want... before the class you need actually exists. "

      So, I can do TDD and code top-down with either of my two designs. And I still don't know what differentiates them.

      I'm lost, help

      I'm getting really confused; I'm not sure whether the second design idea is just stupid.

      I think I understood "from inputs to outputs" the wrong way around. Inputs are events that trigger a piece of functionality, not just data.

      So my input here is the user (me) requesting a calculation. I need design 1.

      Sorry. I will experiment a bit more before asking further questions. And read a book about TDD in Perl.

Re: Test Driven Development example + design question
by Khen1950fx (Canon) on Jul 01, 2012 at 06:39 UTC

    I have a different view of TDD. First, use a bare minimum of lines and keep your structures simple at first, because you want a minimal skeleton of a test to "fail". Remember: fail first, then write a minimal fix. Take it one small step at a time.

    I usually start by writing POD. Put your ideas down in words. Then I use Test::XT. For example, I wrote some POD using some of your ideas:
    package Calculator::ForMyExperiment;

    use strict;
    use warnings;

    =head1 NAME

    Calculator::ForMyExperiment - calculate simple stuff

    =head1 VERSION

    version 0.01

    =cut

    our $VERSION = '0.01';

    =head1 SYNOPSIS

      use Calculator::ForMyExperiment;
      my $calc   = Calculator::ForMyExperiment->new;
      my $result = calculate_simple_stuff($experiment);

    =head1 DESCRIPTION

    This module has a function for calculating stuff.

    =head1 EXPORTS

    The calculate_simple_stuff function is exported on request.

    =cut

    require Exporter;
    our @ISA       = qw( Exporter );
    our @EXPORT_OK = qw( calculate_simple_stuff );

    =head1 FUNCTIONS

    =head2 calculate_simple_stuff

    Merges simple stuff.

    =cut

    sub new {
        my $class  = shift;
        my %params = @_;
        my $self   = {%params};
        bless $self, $class;
        return $self;
    }

    sub calculate_simple_stuff {
        my @stuff = @_;
        # placeholder: just echo whatever we were given
        print "$_\n" for @stuff;
        return;
    }

    1;
    Now use Test::XT to write a test for pod syntax:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Test::XT 'WriteXT';

    my $test = Test::XT->new(
        test    => 'all_pod_files_ok',
        release => 0,
        comment => 'Test that the syntax of the POD documentation is valid',
        modules => {
            'Pod::Simple' => 0,
            'Test::Pod'   => 0,
        },
        default => 't/pod.t',
    );
    $test->module( 'Spreadsheet::WriteExcel' );
    $test->test( 'all_pod_coverage_ok' );

    print "Test script: ";
    print $test->write_string;
    The resulting output:
    #!/usr/bin/perl -l

    # Test that the syntax of the POD documentation is valid
    use strict;
    BEGIN {
        $|  = 1;
        $^W = 1;
    }

    my @MODULES = (
        'Pod::Simple'             => 0,
        'Test::Pod'               => 0,
        'Spreadsheet::WriteExcel' => 0,
    );

    # Don't run tests during end-user installs
    use Test::More;
    plan( skip_all => 'Author tests not required for installation' )
        unless ( $ENV{RELEASE_TESTING} or $ENV{AUTOMATED_TESTING} );

    # Load the testing modules
    foreach my $MODULE ( @MODULES ) {
        eval "use $MODULE";
        if ( $@ ) {
            $ENV{RELEASE_TESTING}
                ? die( "Failed to load required release-testing module $MODULE" )
                : plan( skip_all => "$MODULE not available for testing" );
        }
    }

    all_pod_coverage_ok();

    1;
    Now you're good to go!
Re: Test Driven Development example + design question
by chromatic (Archbishop) on Jul 01, 2012 at 21:58 UTC

    I've not found that the approach matters as much as:

    • Are you working in small steps?
    • Are you verifying your design and implementation with coherent and well-designed tests?
    • Are you refactoring rigorously as you discover a natural design for your system?

    In general I prefer the bottom-up approach, as it requires less scaffolding to build a comprehensive test suite, but neither approach is sufficiently better than the other to matter, in my experience.

      I like this answer, thank you!

      Theory is good, interesting, important. But now I need to practice and assimilate.
      I will try this way of "delegating responsibilities", through an attribute that consumes the Role. I might take this little project as an experiment: finish it with one design, and then try to rebuild it with another. A nice Lego game =o)

      Finally, I think I understand what bottom-up and top-down mean. "Bottom-up" is building the little subroutines (or classes) that we might use, and then building more complex subroutines (or classes) from them. "Top-down" is writing the final script (or the most complex subroutine/class) first, so that from there we know which "smaller elements" to build to make it work.
      I'm ashamed it took me so long to understand :D

Re: Test Driven Development example + design question
by Anonymous Monk on Jul 02, 2012 at 14:59 UTC
    I do not listen to purists or academics who use words like, "valid." I observe the existence of familiar patterns of software-design but do not begin my work by "selecting a design pattern." I want to design code such that it can be tested throughout its construction, but I don't look to anyone's textbook or seminar to validate what I know they're referring to when they say, "test-driven development."

    Frankly, what you are saying is complicated. So complicated that I don't understand it, and so complicated that I daresay you don't understand it either. Complexity is your mortal enemy: complexity of your code, yes, but also complexity of your thoughts. You have a target over there; one of several. You have to choose the best one, locate it precisely, and hit it square. That's it. That's all. It seems to me that you have "thinked yourself" into immobility.

      "Thinked myself into immobility"
      I agree.

      "I observe the existence of familiar patterns of software-design but do not begin my work by "selecting a design pattern." "
      I agree too; this is how things should be done. But it is also very instructive for me to understand other people's experiences and concepts.

      Let me illustrate this with music: the important thing is to play something that you like, and you don't need to learn any theory to do that. But learning music theory transforms the way you listen to music and the way you play it, and gives you an accurate vocabulary to conceptualize it and talk about it. It also lets you play with other people more easily, and share your experience with them in more precise words. They are not the only "valid" words, nor is tonal music theory the only valid music theory, but they are used widely enough that people who know them can communicate accurately. And listen more accurately, in a way. Identifying and naming patterns trains the ear. And the fingers.

      I guess there is no "one way" to do things properly. But it is instructive to try to follow a method rigorously and with dedication, and then develop your own ways around it, change it, and so on.

      I never used to want to look at a programming book ("I can figure it all out by myself"). Now, maybe I'm a bit too much into books. But it's interesting!

      Hopefully, by programming more and reading less, I will stop "thinking myself into immobility". But this thinking process feels very rewarding, even if I get stuck in immobility here and there (^c^).

      My problem here was not really about TDD but about design, so I ordered a book about design. It will not help me "pick the right design among the valid ones", but rather "make me travel" through possible designs and think about them. It feels like good training, but it involves lots of reading and "being stuck in concepts" at first.

      I guess my message feels defensive. I guess it is, in a way. But I agree with what you are saying.

      PS: different people, different ways of learning.