in reply to Re^3: Self-testing modules
in thread Self-testing modules

I don't use the Test::* modules as they tell me what passed rather than what failed. I also have very definite ideas about the form that unit testing should take and that does not fit well with the pattern of a zillion ok()/not_ok() tests that those modules encourage.

I've been in a QA position for only a few months (though I've been coding Perl for years) and am interested in new perspectives on testing software. Would you care to tell me what modules you do use in testing, and to expand a bit more on your perspective?

--DrWhy

"If God had meant for us to think for ourselves he would have given us brains. Oh, wait..."

Replies are listed 'Best First'.
Re^5: Self-testing modules
by BrowserUk (Patriarch) on Jul 28, 2005 at 04:00 UTC

    See Test::LectroTest for the only CPAN test module (I consider) worthy of the Test:: prefix.

    Simplified rationale:

    Most bugs arise as a result of the programmer making assumptions. If the same programmer writes the tests for the code s/he wrote, they will make the same assumptions. The net result is that they write tests for every case they considered when writing the code, and they all pass--giving them N of N tests (100%) passed and a hugely false sense of security.

    The cases they fail to test for are the same cases they failed to consider when writing the code, and those are normally the same cases that crop up the moment they demo it or put it into production.

    With anything other than the most trivial of functions, hoping to test all possible combinations of inputs and verify their outputs is forlorn--i.e. impossible in any practical sense of the term.

    Therefore, the only way to test code is to test its compliance against a (rigorous) specification, and derive confidence through statistics. Ideally, this would go one step further than LectroTest and retain a record of failing values, so that they could be reused (along with a new batch of randomly generated ones) at each subsequent test cycle. (IMO) this is the only way that testing will be lifted out of its finger-to-the-wind, guesswork state and move into something approaching a science.
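
    For anyone that hasn't played with it, a LectroTest property reads roughly like the module's synopsis. The function and the condition below are made up purely for illustration; the point is that the inputs are drawn at random from generators rather than hand-picked by the person who wrote the code:

        use Test::LectroTest;   # turns this file into a self-running test program

        # A stand-in for whatever you are really testing:
        sub my_function { my( $x, $y ) = @_; return $x * $x + $y * $y }

        Property {
            ##[ x <- Int, y <- Int ]##
            my_function( $x, $y ) >= 0;
        }, name "my_function never returns a negative number";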

    LectroTest isn't perfect (yet). It has fallen into the trap of becoming "expectation compliant", in as much as it plays the Test::Harness game of supplying lots of warm fuzzies in the form of ok()s, and perpetuating the anomaly of reporting "99.73% passed" instead of "0.27% failed", or better still:

    ***FAIL*** Line nnn of xxxxxx.pl failed running function( 1, 2, 3 ); Testing halted.

    Preferably dropping the programmer into the debugger at the point of failure. Even more preferably, in such a way that the program can be back-stepped to the point of invocation and single-stepped through the code with the failing parameters in place, so that the failure can be followed.
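
    You can approximate the first part of that today with nothing more than the stock debugger: have the failure path set $DB::single and run the suite under perl -d. A rough sketch only--the check() wrapper and function() below are invented, and this isn't something LectroTest does for you:

        # Run under the debugger:  perl -d this_test.pl
        sub function { my( $x, $y, $z ) = @_; return $x + $y + $z }   # stand-in for the code under test

        sub check {
            my( $ok, @args ) = @_;
            unless( $ok ) {
                warn "***FAIL*** function( @args )\n";
                $DB::single = 1;     # drops into the debugger here when running under perl -d
            }
            return $ok;
        }

        check( function( 1, 2, 3 ) == 42, 1, 2, 3 );   # fails (6 != 42), so we stop right here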


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
      Most bugs arise as a result of the programmer making assumptions. If the same programmer writes the tests for the code s/he wrote, they will make the same assumptions. The net result is that they write tests for every case they considered when writing the code, and they all pass--giving them N of N tests (100%) passed and a hugely false sense of security.

      I find that this does not happen if you're using TDD. When you only write code in response to a failing test, you are forced to challenge the assumptions in your code at every stage. Every time you make something work, the next step is "how do I break this?"
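
      To make that concrete--MyCounter and next_id() below are invented purely to show the rhythm--the cycle with Test::More is: write the test, watch it fail, write just enough code to make it pass, then immediately ask "how do I break this?" and write that test too:

          # The test is written first; at this point MyCounter.pm doesn't exist yet,
          # so the run fails -- and that failure is what drives the code.
          use Test::More tests => 3;
          use MyCounter;                                              # invented module name

          my $c = MyCounter->new;
          is( $c->next_id, 1, 'first id is 1' );
          is( $c->next_id, 2, 'ids increase on every call' );         # the "how do I break this?" test
          is( MyCounter->new->next_id, 1, 'a fresh counter starts over' );

          # ...and the least code that turns it green (lib/MyCounter.pm):
          #
          #     package MyCounter;
          #     sub new     { bless { n => 0 }, shift }
          #     sub next_id { ++$_[0]{n} }
          #     1;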

      Therefore, the only way to test code is to test its compliance against a (rigorous) specification, and derive confidence through statistics.

      As you can probably guess, I don't agree with the "therefore" and the "only" :-)

      Specification-based testing is a great tool, but it's certainly not the be-all and end-all of testing. It brings its own set of good and bad points to the table, and it is still affected by bad developer assumptions about the code. They're just assumptions of a different kind.

      There's a whole bunch of different ways to go about testing: specification-based tests like Test::LectroTest, xUnit frameworks like Test::Class, procedural tests like the basic uses of Test::More and friends, data-driven tests like Test::Base, exploratory testing, integration testing frameworks like FIT, and so on.
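
      To give a flavour of the xUnit style next to the procedural one, here's a minimal Test::Class sketch (My::Widget and its test class are invented for the example):

          # A stand-in class just so the sketch is self-contained:
          package My::Widget;
          sub new   { bless { items => [] }, shift }
          sub count { scalar @{ $_[0]{items} } }

          package My::Widget::Test;
          use base qw( Test::Class );
          use Test::More;

          sub setup : Test(setup) {              # runs before every test method -- the xUnit fixture
              my $self = shift;
              $self->{widget} = My::Widget->new;
          }

          sub starts_empty : Test(1) {
              my $self = shift;
              is( $self->{widget}->count, 0, 'a new widget holds nothing' );
          }

          package main;
          My::Widget::Test->runtests;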

      Take a look at Lessons Learned in Software Testing for a great book on the multitude of useful approaches to testing.

      Not to mention practices like Test Driven Development and Design By Contract.
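
      On the Design By Contract side, even something as small as Carp::Assert lets you state pre- and post-conditions inline. A trivial, made-up example:

          use Carp::Assert;

          sub withdraw {
              my( $balance, $amount ) = @_;
              assert( $amount > 0 )              if DEBUG;   # precondition
              my $new_balance = $balance - $amount;
              assert( $new_balance <= $balance ) if DEBUG;   # postcondition
              return $new_balance;
          }

          print withdraw( 100, 30 ), "\n";   # 70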

      Picking the best tool for the work at hand is part of the job.

      LectroTest isn't perfect (yet). It has fallen into the trap of becoming "expectation compliant", in as much as it plays the Test::Harness game of supplying lots of warm fuzzies in the form of ok()s, and perpetuating the anomaly of reporting "99.73% passed" instead of "0.27% failed", or better still:

      Well the nice thing about Perl is that if you don't like the test reporting you can always change it. In fact, since I spent a chunk of yesterday re-learning how to fiddle with Test::Harness::Straps...

      .... insert sound of typing here ...

      ...there you go.

      Personally, since test suites in Perl take so darn long to run, I like seeing the okays whirl past in the background, since it lets me know the darn thing hasn't hung.

      I realise that anything less than 100% pass at the end means I've fucked up, warm-fuzzies or not.

      Preferably dropping the programmer into the debugger at the point of failure. Even more preferably, in such a way that the program can be back-stepped to the point of invocation and single stepped through the code with the failing parameters in-place so that the failure can be followed.

      There are already so-called Omniscient Debugging tools available for Java. So I guess it's just a trivial matter of programming :-)

        Most bugs arise as a result of the programmer making assumptions. If the same programmer writes the tests for the code s/he wrote, they will make the same assumptions. The net result is that they write tests for every case they considered when writing the code, and they all pass--giving them N of N tests (100%) passed and a hugely false sense of security.

        I find that this does not happen if you're using TDD. When you only write code in response to a failing test, you are forced to challenge the assumptions in your code at every stage. Every time you make something work, the next step is "how do I break this?"

        I'm still making up my mind about TDD, but for the time being I think I'm leaning towards agreeing with BrowserUk on this one. In my experience the nasty bugs come, literally, from where I least expect them, and this is crucial. There is no hope that I will somehow write a test to catch such a bug, no matter how hard I try, because the best I can do is test those aspects that I regard as potential sources of problems. And in fact, during my recent applications of TDD, some very nasty bugs have arisen despite a rigorous adherence to TDD principles. (These bugs all became manifest after the system had "aged" a bit and attained a particular, and as it turns out ill-conditioned, state; therefore, all the simple tests that checked individual functions for expected outputs missed these "history-dependent" bugs. I'm beginning to see that the functional programming folks are on to something with their avoidance of assignment and side effects.)
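
        Here's a contrived sketch of the kind of history-dependent failure I mean (the History class is invented; it isn't code from my actual project):

            package History;
            sub new    { bless { events => [] }, shift }
            sub add    { my $self = shift; push @{ $self->{events} }, @_ }
            sub remove { my $self = shift; pop @{ $self->{events} } }
            sub latest { $_[0]{events}[-1] }

            package main;
            # The obvious expected-output tests all pass:
            my $h = History->new;
            $h->add('x');          # latest() eq 'x'   -- ok
            $h->remove;            # returns 'x'       -- ok
            # ...but a long-lived instance that somewhere sees one more remove()
            # than add() reaches a state none of those tests visit: latest() now
            # quietly returns undef, and the breakage surfaces much later, a long
            # way from the call that caused it.
            $h->remove;            # one remove too many
            print defined $h->latest ? "fine\n" : "undef -- the history-dependent state\n";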

        Also, I find the difference between your take on TDD and the one described by Kent Beck in his widely cited TDD by Example interesting. Beck uses "test first" only as a precondition for adding functionality to his software. I.e., he says that one should not write any new code in one's application until one has written a failing test that will succeed only after the new code has been written. He makes no mention of writing tests specifically designed to make the software fail. Admittedly, one can view this sort of "stress" testing as a special case of Kent's formulation: namely, the "functionality" one is adding is general robustness. Still, I am surprised that Beck's book puts so little emphasis on this aspect of testing.

        the lowliest monk