Recently, there has been a lot of mention of Test-Driven Development (TDD). It intrigued me, so I decided to use it when building a relatively involved script at work yesterday.

Results:

All in all, I really like it. It definitely forces the developer to design things so that they can be verified. It won't work in all situations, but I'll probably always be working on verifiable tasks. New methodology, here I come!

------
We are the carpenters and bricklayers of the Information Age.

The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: TDD: a test drive and a confession
by Ovid (Cardinal) on Sep 17, 2003 at 15:49 UTC

    Ah, yes, one of my favorite subjects :)

    The single greatest benefit that I have gained from tests is that I am a better programmer. It's not just that I can run my tests and know that my code works; by writing my tests first, I am forced to look at things from the standpoint of interfaces and who will be using my code. I need things to be simple and intuitive. As a result, my code is more likely to be simple and intuitive. Quite frequently, after a heavy refactoring of legacy code, I find myself with a module that has a bunch of one line methods that rarely take more than one argument (other than $self). That makes me feel good.

    And I always find strange bugs that I never thought about, but I also find myself writing fewer bugs since I am focusing on the simplest thing that can get the tests to pass.

    The non-deterministic factor is what worries me. That's actually quite a debate at my current job. Currently, we create test data, put it in a database and then nod when we get the test data back out. I think that's a problem because we're testing our expectations of what the data should look like rather than testing what the data actually is. The rebuttal is "we can't test it if we don't know what it returns". In other words, they're afraid of non-deterministic tests.

    There are plenty of ways to get around this. We can make raw SQL queries against a copy of the live data to verify that the code returns the correct data. We can use regular expressions to verify that the form of the data that we return is correct, or subroutines that return true or false depending upon whether or not the data is in the correct domain (e.g., day of week is 1 to 7 rather than just any integer).
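
    For instance, here is a minimal sketch of that shape-checking idea using Test::More (the row layout and the day_of_week_ok subroutine are made up purely for illustration, not taken from any real schema):

    use strict;
    use warnings;
    use Test::More tests => 3;

    # Stand-in for a row pulled back from a copy of the live data;
    # in a real test this would come from the code under test.
    my $row = { order_date => '2003-09-17', day_of_week => 3, total => 42.50 };

    # Check the *form* of the data rather than exact values.
    like($row->{order_date}, qr/^\d{4}-\d{2}-\d{2}$/,
        'order_date looks like an ISO date');
    ok(day_of_week_ok($row->{day_of_week}), 'day_of_week is in the 1..7 domain');
    cmp_ok($row->{total}, '>=', 0, 'total is never negative');

    # A domain-checking subroutine just returns true or false.
    sub day_of_week_ok {
        my $day = shift;
        return defined $day && $day =~ /^[1-7]$/;
    }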

    You can also use Test::MockObject to override the difficult interfaces. If you find that too much overhead, you can override interfaces directly in your code. Here's one handy technique, where the subroutine under test calls fetchrow_arrayref internally:

    my $results;
    my $test_data = [qw{foo bar baz}];
    {
        no strict 'refs';
        local $^W;    # suppress "redefined" warnings
        my $arrayref_called = 0;
        local *{'DBI::fetchrow_arrayref'} = sub { $arrayref_called++; $test_data };
        $results = some_func($some_arg);
        is($arrayref_called, 1, '... and fetchrow_arrayref should be called only once');
    }
    ok($results, '... and some_func succeeded');
    is_deeply($results, $test_data, '... and it should return the correct data');

    Finally, for troublesome built-ins that can be overridden, try ex::override.

    my @rand = qw/
        0.78195951257749
        0.625570958105044
        0.884315045123127
        0.137177303290578
        0.0650888668725038
    /;
    use ex::override rand => sub { unshift @rand => pop @rand; $rand[0] };
    print rand, $/ for 1 .. 5;

    Now you have deterministic "random" numbers :)

    Cheers,
    Ovid


      As an alternative to the rand example, you could use the other function in the rand API.

      srand( 17 );
      print rand, $/ for 1 .. 5;

      Since the numbers produced by rand are, by definition, deterministic for a given seed, this simplifies your test.

      Of course, this approach falls down if the implementation of rand is different on different platforms. (I don't know if Perl has one internally or just bases the code on the rand provided by the C compiler.)

      G. Wade

      Update: God, I'm such a dolt sometimes. I had originally read this thread yesterday, before the other reply on this subject was there. I didn't refresh, and just thought I'd point out the below in a kind of teasing way. I realize it has no real bearing on your underlying point, but I thought I'd take a nudge at the last comment. Anyway, this subject has been covered already. Move along now, try not to laugh at the merphqling...

      Now you have deterministic "random" numbers :)

      E:\>perl -le "srand 1; print join q(,),map { int rand 100 } 1..10;"
      0,56,19,80,58,47,35,89,82,74
      E:\>perl -le "srand 1; print join q(,),map { int rand 100 } 1..10;"
      0,56,19,80,58,47,35,89,82,74
      E:\>perl -le "srand 1; print join q(,),map { int rand 100 } 1..10;"
      0,56,19,80,58,47,35,89,82,74
      E:\>perl -le "srand 1; print join q(,),map { int rand 100 } 1..10;"
      0,56,19,80,58,47,35,89,82,74

      Digital computers don't make non-deterministic "random" numbers... :-)


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: TDD: a test drive and a confession
by dws (Chancellor) on Sep 17, 2003 at 15:25 UTC
    ... for any non-trivial test, you're going to have (potentially) non-deterministic output.

    I'm wondering if this is what you meant to say. Unit tests are the most effective when they're 100% deterministic. Getting there usually requires using predetermined test data, and some careful use of mock objects to return deterministic values.

    Taken to extreme lengths, this can involve switching in a mock implementation of time() that's under the control of test code, so that you can control the advancement of time (figuratively, of course).
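
    A minimal sketch of that idea in Perl, using a CORE::GLOBAL override of the time built-in (the make_record subroutine is hypothetical; the override mechanism is the point):

    use strict;
    use warnings;

    # Overriding a built-in has to happen at compile time,
    # before the code under test is compiled.
    my $fake_time;
    BEGIN {
        no warnings 'once';
        *CORE::GLOBAL::time = sub () { $fake_time };
    }

    use Test::More tests => 2;

    # Hypothetical code under test: something that timestamps a record.
    sub make_record { return { created_at => time() } }

    $fake_time = 1_000_000;
    is(make_record()->{created_at}, 1_000_000, 'record stamped with controlled time');

    $fake_time += 3600;    # advance the "clock" by an hour
    is(make_record()->{created_at}, 1_003_600, 'time advances only when the test says so');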

    I had to put my design on paper. This may be the most important aspect of TDD.

    This may be one of those YMMV (Your Mileage May Vary) things. I've found no difference in the need to have a couple of good drawings of the system architecture (as it evolves) between doing TDD and not doing TDD. With TDD, you're forced to confront interfaces early, and you're encouraged to not build more than you need for the story at hand. This encourages lean design, which I've found to be cleaner than a "grand up-front design". On the other hand, unit tests often require using delegation (to be able to swap in mock implementations), which does add a layer to the design. Having a picture might help.

    [TDD] definitely forces the developer to design things so that they can be verified.

    Exactly.

Re: TDD: a test drive and a confession
by adrianh (Chancellor) on Sep 17, 2003 at 14:52 UTC

    Ahhhh good.... another convert :-)

    I'm a little curious about you saying that:

    I had to put my design on paper. This may be the most important aspect of TDD. I can usually hold most designs in my head, but I've been working on some projects that no-one can hold in their head. (Well, maybe Abigail-II, but not most people.) I don't have the habits of putting stuff on paper to work it through. This will give me those habits.

    In my experience I get the complete opposite. Because TDD makes me move in very small steps, refactoring all of the time, I find that I don't have to keep that big design in my head - just whatever bit I happen to be working on at the time.

      I think we're talking about the same thing. I'm traditionally a holistic developer - I work on all the parts at once. With systems under 2.5k lines and under 50-ish modules, that's generally worked well. Anything over that and I start losing details. This methodology allows me to forget the details because I had to write them down to build tests for them. So, I actually have notes that make sense! :-)

      ------
      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

        I think we're talking about the same thing

        I don't think we are :-) I've come from the other direction. In the old days I did a lot of up-front design and documentation work. Now I incrementally produce the design using TDD and constant refactoring. Most of the design is in the tests - I produce few external design documents, and many of the ones I do produce are discarded as soon as the tests they help me with are written.

        This methodology allows me to forget the details because I had to write them down to build tests for them. So, I actually have notes that make sense! :-)

        Do you need to keep your notes after you have written your tests? If so, what's in the notes that's not in the tests?

        (he asks curiously :-)

Re: TDD: a test drive and a confession
by BrowserUk (Patriarch) on Sep 17, 2003 at 19:30 UTC

    Coming up with tests wasn't easy. The stupid ones that exercise your handling of Getopt::Long were simple. Getting tests that actually exercised code that deals with abstract values ...

    I've been writing test driven code (no caps) for years. It started when doing functional verification of the OS/2 APIs. 80% of the code I wrote over the 3 1/2 years I was involved had only one purpose: to test the APIs it used. Writing small sections of code and the test(s) to verify it, as a single unit of work, was the only way to do it. It's one of those things that worked so well for me that I've continued doing it (for the 10+ years since) even when it wasn't a strict requirement.

    I still find it easier (and equally effective) to write the code that uses the interface, and then devise the mechanism for testing it. I feel that this means I don't tailor my use of the interface to fit my testing methods, which I have seen happen when doing it the other way around. For any given project, half a dozen or so verification techniques start to repeat, and it rapidly becomes second nature to decide which one fits the bill for a particular situation.

    This may be less necessary for Perl than for C/C++, as the availability of eval means that you can do things in Perl that are simply unavailable in C(++), but I still think that you need more than the simple ok/nok for most things.
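
    For instance, Test::More's richer comparison functions already go some way beyond a bare ok(); this snippet is purely illustrative, with made-up data:

    use strict;
    use warnings;
    use Test::More tests => 3;

    # Invented data structure standing in for a real result.
    my $got = { name => 'example', scores => [ 70, 85, 92 ] };

    # A bare ok() only reports pass/fail...
    ok($got, 'got a result at all');

    # ...while richer checks show *what* differed when they fail.
    like($got->{name}, qr/^\w+$/, 'name is a single word');
    is_deeply($got->{scores}, [ 70, 85, 92 ], 'scores round-tripped intact');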

    This is key because for any non-trivial test, you're going to have (potentially) non-deterministic output...

    One well-used technique with functions that give contiguous ranges of output for discrete sets of input is to look for and validate the transitions across the boundaries, also called edge cases (and corner cases). The trick here is to evolve your tests from the inputs rather than the outputs. That is to say, don't look for where you think (or the code under test pre-determines) that the transitions occur, but rather:

    • Start by iterating over the full range of each of the inputs (initially in large steps) and look for transitions.
    • Gradually reduce the size of the steps for one of the inputs whilst reducing the range(s) covered by that input. In this way you can, in effect, binary-step your way down to the smallest range and step size needed to cover the transition(s) from good to bad.
    • Do this for each of the inputs in turn, returning to the full range / large steps for the previously resolved input whilst you work on the next.
    • Finally, reduce the test case to covering only the minimum set of ranges / step sizes for each of the inputs required to cover the transitions.

    That probably sounds horrendously laborious, but for most functions (as in subroutines rather than the mathematical usage), even those with relatively large numbers of inputs (parameters), varying one parameter at a time over a limited range whilst iterating the others through the full spread in large steps means that you can usually zero in on the transition points quite quickly. You sometimes find that when you start varying the second input in smaller steps, it isolates more (or moves) transitions in the first range, but doing things stepwise usually makes it obvious when this is happening and how to compensate.
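
    A rough sketch of that coarse-then-fine sweep for a single input (the classify function and its ranges are invented purely to have something to scan):

    use strict;
    use warnings;

    # Invented function under test: classifies an input as 'ok' or 'fail'.
    sub classify { my $x = shift; return $x < 37.5 ? 'ok' : 'fail' }

    # Pass 1: sweep the full range in large steps, noting intervals where the output changes.
    my (@suspect, $prev);
    my $step = 10;
    for (my $x = 0; $x <= 100; $x += $step) {
        my $out = classify($x);
        push @suspect, [ $x - $step, $x ] if defined $prev && $out ne $prev;
        $prev = $out;
    }

    # Pass 2: re-scan each suspect interval with a much smaller step to pin down the boundary.
    for my $interval (@suspect) {
        my ($lo, $hi) = @$interval;
        my $fine = ($hi - $lo) / 100;
        my $last = classify($lo);
        for (my $x = $lo + $fine; $x <= $hi; $x += $fine) {
            if (classify($x) ne $last) {
                printf "transition between %.2f and %.2f\n", $x - $fine, $x;
                last;
            }
        }
    }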

    The idea of reducing the coverage of testcases is anathema in some circles, but by concentrating on an intelligently selected subset of all possible tests, you can increase the effectiveness of the testing whilst reducing the test time. The shorter the test time, the more frequently it is likely to be run.

    As with many things in life, more does not necessarily equal better. Sometimes more is just more, without benefit.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

      ...by concentrating on an intelligently selected subset of all possible tests, you can increase the effectiveness of the testing whilst reducing the test time. The shorter the test time, the more frequently it is likely to be run.

      To an extent I agree, but not totally. For most of the modules I've written the past year or so, I've found that I actually need tests for 3 contexts:

      1. Does it seem to work in the user's environment?
      This is the test that you typically want run in a CPAN(PLUS) environment when a user installs a module. It should be relatively thorough, but shouldn't take too long.

      2. Does it survive stress testing?
      For most linear programs this is not really an issue, but if you're working on threaded programs like I have been for the past year, you know that something may appear to work under "normal" conditions, but will break down when being hit with everything you've got. ;-(

      3. Do all possible combinations of parameters work?
      This is the test that you keep internally to make sure that, whatever combination of parameters occurs, something valid happens. This is especially important with combinations of command line parameters. It can run as long as you need: for one project I'm working on, it typically runs for 45 minutes, not something you would like your average user to go through.

      Most of my modules on CPAN have at least tests in category 1. Some of them have tests in category 2 as well. Category 3 tests I only run internally, or have run by a user who is experiencing specific problems, not during a standard test/install sequence.
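
      A sketch of how a category 3 test file can be kept out of the standard install sequence, gated on an environment variable (the EXHAUSTIVE_TESTS name and run_with_options are only illustrative):

      use strict;
      use warnings;
      use Test::More;

      # Only run the exhaustive combination sweep when explicitly asked for.
      plan skip_all => 'set EXHAUSTIVE_TESTS=1 to run the full combination sweep'
          unless $ENV{EXHAUSTIVE_TESTS};

      my @verbose = (0, 1);
      my @retries = (0, 1, 2);
      plan tests => scalar(@verbose) * scalar(@retries);

      # Exercise every combination of the parameters.
      for my $v (@verbose) {
          for my $r (@retries) {
              ok(run_with_options(verbose => $v, retries => $r),
                  "verbose=$v retries=$r produces something valid");
          }
      }

      # Stand-in for invoking the real program with those options.
      sub run_with_options { return 1 }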

      Liz