in reply to Re^3: Need advice on test output
in thread Need advice on test output

File and line number make no sense in the context of a failed test. Consider:
1: for my $key ( sort keys %tests ) {
2:     is( $tests{$key}, $wooby{$key} );
3: }
If you know that the test failed on line 2, that doesn't help you much.

xoxo,
Andy

Replies are listed 'Best First'.
Re^5: Need advice on test output
by BrowserUk (Patriarch) on Jan 06, 2007 at 01:13 UTC
File and line number make no sense in the context of a failed test.

Au contraire. I'd at least have a starting point, even in this somewhat contrived example.

In more normal cases, about 95% of the test scripts I've looked at consist of long linear lists of unnumbered ok()s and nok()s, and having the line number of the failing test would save me from having to play that most ridiculous of games: count the tests. Are they numbered from zero or one? Does a TODO count or not? Do tests that exist inside runtime conditional if blocks count if the runtime condition fails? If not, how can I know whether that runtime condition was true or false? Etc.

Of course, in this case I'd need other information too. But then, in this case, the test number would be of no direct benefit either: I'd have to modify the .t file to print out a sorted list of the keys to %tests at runtime, as there would be no other way to work out which key corresponds to test N.

Oh damn! But then tracing stuff out from within a test script is a no-no, because the test tools usurp STDOUT and STDERR for their own purposes, taking away the single most useful, and most used, debugging facility known to programmer-kind: print.

And there you have it, today's number one reason I do not use these artificial, overengineered, maniacally OO test tools. They make debugging and tracing the test script 10 times harder than doing so for the scripts they are meant to test.

They are an unbelievably blunt instrument whose basic purpose is to display and count the number of boolean yeses and nos. To perform this simple function, they:

• usurp the simplest and best debugging tool available;
• force me to divorce my tests from the code under test;
• curtail my ability to use debuggers;
• wrap several layers of complication around the debugging process;
• and throw away reams of useful--I would say vital--information in the process.

And all of this so as to produce a bunch of 'pretty pictures and statistics' that I have no use for, and that I have to sift and filter through yet another layer (the test harness) to produce the only statistic I am interested in:

    What failed and where?

For all the world this reminds me of those food ads and packaging that proclaim to the world: "Product X is 95% fat free!". Ugh. You mean that 5% of that crap is fat?

To date, the best testing tool I've seen available is Smart::Comments. Its require, assert, ensure, insist, check, confirm, & verify directives are amazingly simple, amazingly powerful (a short sketch follows the list below).

    • These allow me to place the tests right there in the code being tested.
    • One file including code and tests.
    • When failures occur, I get useful information, including but not limited to the file and line number where the failing test occurred.
• They are easily and quickly enabled & disabled by the addition of a single comment character at the top of the code.
    • I can enable them on a per file basis and so only test that code I am interested in and not wait for all the tests I'm not interested in to execute first.
• I can have multiple levels of test that allow me to use a coarse granularity of tests to home in on the failure, and then a fine granularity to isolate the exact point of failure--in the code that is being tested, not some third party test script.
• Most importantly of all: they allow me to take some user's testcase that is causing failures in my code, turn the tracing and debugging on in my module(s), run that user script, and see the results.

      This is immediate and accurate.

I do not have to modify the user-supplied testcase in any way. And that is the holy grail of testing: run the user script, unmodified, on my system, with debugging enabled within my modules only.

And if the user's testcase has a bunch of complex dependencies that I do not or cannot have, I can instruct the user to go into his copy of my modules and delete 1 character, and all of my tests are enabled. He can then run his testcase in his environment and supply the output to me, and I can see exactly what went on.

      This is priceless!

• Finally, when testing is complete, because it is a source filter, commenting out the use line means that all--every single bit of the test code; the overhead; the setup; everything--gets the hell outta Dodge. It is simply gone.
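A minimal sketch of how those assertion comments read in practice (the hypot() routine is mine, purely illustrative; the ### directives themselves are the module's real syntax):

    use Smart::Comments;    # comment out this one line and every check vanishes

    sub hypot {
        my ($x, $y) = @_;
        ### require: $x >= 0 && $y >= 0
        my $h = sqrt( $x**2 + $y**2 );
        ### ensure: $h >= $x && $h >= $y
        return $h;
    }

    print hypot( 3, 4 ), "\n";    # prints 5
    hypot( -1, 2 );               # dies, reporting the failed condition, the
                                  # variable values, and the file and line number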

Smart::Comments is the single most useful, and most underrated, module that theDamian (and possibly anyone) has yet posted to CPAN. I recognised the usefulness of the concept long ago when I came across Devel::StealthDebug, which may or may not have been the inspiration for Smart::Comments. In use, the former proved to be somewhat flaky, but theDamian has worked his usual magic with the concept (whether it was the inspiration for it or not), and come up with a real, and as yet unrecognised, winner.

    To achieve the perfect test harness,

1. Supply a patch to Smart::Comments that allows it to be enabled/disabled via an environment variable (with the default being OFF).
2. Also patch it so that, with an appropriate setting in that environment variable, failing asserts et al. become non-fatal: they log the assertion failure (warn-style, but with a Carp::cluck-style traceback) and allow the code to continue (as it would in production environments). A rough sketch of this mode follows the list.

The information logged for a failure would also be logged for a success in this mode of operation.

    3. Write a test harness application to parse that output and produce whatever statistics and summary information is useful.
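Here is a rough sketch of the non-fatal behaviour I mean in item 2; the SMART_COMMENTS_NONFATAL variable and the check() helper are inventions for illustration, not anything Smart::Comments provides today:

    use Carp qw(cluck);

    sub check {
        my ($ok, $desc) = @_;
        if ($ok) {
            # In non-fatal mode, successes get logged too.
            warn "ok: $desc\n" if $ENV{SMART_COMMENTS_NONFATAL};
            return 1;
        }
        if ( $ENV{SMART_COMMENTS_NONFATAL} ) {
            cluck "not ok: $desc";    # warn with a full traceback, carry on
            return 0;
        }
        die "not ok: $desc";          # default: fatal, as asserts are today
    }

    check( 1 + 1 == 2, '1 + 1 == 2' );
    check( 1 + 1 == 3, '1 + 1 == 3' );    # dies here unless non-fatal mode is on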

    Why haven't I written it yet? Because I keep hoping that Perl6, oops, Perl 6 is 'just around the corner', and I'm hoping that Smart::Comments will be built-in.

Of course, a few additional modules wouldn't go amiss. Smart::Comments::DeepCompare, Smart::Comments::LieToTheCaller and a few others, but mostly it's all right there.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

To achieve the perfect test harness, 1) supply a patch to Smart::Comments that allows it to be enabled/disabled via an environment variable (with the default being OFF). [...]

You don't need a patch for this item. You can use the if pragma to achieve this.

      use if $ENV{USE_SMART_COMMENTS}, 'Smart::Comments';

Also, you could omit the use Smart::Comments line entirely and use -M instead.

      perl -MSmart::Comments script.pl

However, the effect is shallow when using -M: because Smart::Comments is a source filter, only the file loaded directly is affected. To test a module this way, you'd have to execute it as a script.

      perl -MSmart::Comments Module.pm

        Thanks, I like that.

I'd still like the ability to switch between a 'say nothing for passes but stop at the first assert failure' development mode and a 'report everything, passes and failures, but don't stop' run-a-user-testcase mode, and also the ability to adjust the Smartness level, all via the command line or environment.
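(For the Smartness level at least, something close already exists, if I'm remembering the module's docs correctly: you pass the levels you want to the use line, and there is also an -ENV option that ties activation to the Smart_Comments environment variable. A sketch, with the mode switching still wishful thinking:)

    use Smart::Comments '###', '####';   # coarse and medium levels active

    # The three-# and four-# checks below run; the five-# one stays inert
    # because '#####' was not passed to the use line.
    ### assert: 1 + 1 == 2
    #### check: 2 * 2 == 4
    ##### verify: 3 * 3 == 9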


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
      I'd at least have a starting point, even in this somewhat contrived example.

That's why you use comments on your tests: to describe what you're testing, and to make it easy to track down later. If you're getting bare ok 1\nok 2\nok 3 output, it's your own fault.
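For instance, a minimal sketch of that idiom applied to the loop upthread (the hash contents are stand-ins):

    use Test::More tests => 3;

    my %tests = ( foo => 1, bar => 2, baz => 3 );   # stand-in data
    my %wooby = ( foo => 1, bar => 2, baz => 4 );

    for my $key ( sort keys %tests ) {
        # The third argument names the test, so a failure reads
        # "not ok 2 - value for key 'baz'" rather than a bare "not ok 2".
        is( $tests{$key}, $wooby{$key}, "value for key '$key'" );
    }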

      xoxo,
      Andy

        But don't you see the dichotomy here? Because the test tools don't capture the line numbers, I have to add comments to allow me to get back to the line numbers.

Not only does that create extra work (thinking up appropriate comments, typing them, etc.), it also creates a bunch of knock-on problems. For example:

1. In the summary screens that started this thread, the only really useful bit is the concise list of failing test numbers, but there is no easy way to relate those numbers back to the failing tests.

This means you have to re-run the individual failing test scripts (using that syntax that I can never recall) in order to get the full output (which in the process pushes my useful information off the top of my buffer).

If the line numbers were available, they could be added to the summary list without any great problem.

          t/bar.t        4  1024    13    4  2 6-8

          could become:

          t/bar.t       13    4 2(27) 6(54)-8(77)

          You couldn't easily do the same with the comments.

2. Most, if not all, programmers' editors on earth worthy of the name have the built-in ability to parse "compiler output", extract filenames and line numbers, and take you straight to the appropriate line of the appropriate file.

If the line numbers were available in the test harness summary data as above, I could see myself writing a short editor macro (in my fairly macro-challenged preferred editor) to run the test harness, capture & parse the summary output, and use it to step through the failing tests (a sketch of such a parser follows this list).

Doing something similar with the current setup would involve: running the test harness; capturing & parsing the summary screen; re-running the failing test script individually; capturing & parsing its output; extracting the failing test-case comments (if the author has provided them!); loading the test script; searching through it for the comment; and hoping that it is unique.

I'm not saying this isn't doable in something like emacs, but it's just so much extra work, and it isn't guaranteed to work. Line numbers in files are unique by definition. Comments might be, or they might not.

        3. It doesn't take too much thought to see other tools for which comments are a poor and messy substitute for line numbers.
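To make point 2 concrete, here is a sketch of the sort of throwaway parser an editor macro could call, assuming the augmented test(line) summary format I proposed in point 1 (my invention, not anything the current harness emits):

    # Turns each test(line) pair, including the endpoints of ranges such as
    # 6(54)-8(77), into "file:line:" diagnostics an error-parser can jump on.
    my $summary = 't/bar.t       13    4 2(27) 6(54)-8(77)';

    my ($file) = $summary =~ /^(\S+)/;
    while ( $summary =~ /(\d+)\((\d+)\)/g ) {
        my ($test, $line) = ($1, $2);
        print "$file:$line: test $test failed\n";
    }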

And remember, either way, all of this only gets me back to the place in the test script where the test failed. I've still got to get from there back to the code that it tests, and that could be literally anywhere. If the test that tests the code is in the same file and in rough proximity to the code being tested, and the failing test output incorporates the filename and line number, then my simple editor macro can take me straight there in one jump.

There is simply no way to do this with the current system. The best you can do is see what APIs are being called in the failing test, then grep all the source files and hope you turn up a likely looking candidate. This is bad enough in a moderately complex suite of your own writing, but back-tracking from a complex test suite for code you didn't write to the failing code is nigh impossible.

All those extra steps and discontiguous paths just throw away the beauty of the edit/run loop that makes Perl (and other dynamic languages) such a delight to program in.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.