I'm not sure what you're asking. I read the output to find out which tests, if any, are failing.
If one needs this output in a radically different format (one person I know needs it in XML), then they simply write a different harness which creates the output format they need. For the moment, I'm focusing on the typical "I just ran my test suite and have a bunch of stuff on the terminal" case. I want to know how that 'stuff' should be formatted to be most useful to you.
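As a purely illustrative (and untested) sketch of how little that takes: a harness is essentially just something that runs the test scripts, reads the TAP they print (lines like "1..10", "ok 1", "not ok 5 - description"), and re-emits it in whatever form you want. Nothing below is real Test::Harness code; a crude XML-ish variant might look like this:

#!/usr/bin/perl
# Purely illustrative sketch, not Test::Harness: run each test script,
# read the TAP it prints, and re-emit it as crude XML. No escaping,
# no error handling -- just enough to show the shape of a harness.
use strict;
use warnings;

for my $script (@ARGV) {                      # e.g. perl tap2xml.pl t/*.t
    open my $tap, '-|', $^X, $script or die "Can't run $script: $!";
    print qq{<test-script name="$script">\n};
    while (<$tap>) {
        if (/^1\.\.(\d+)/) {                  # the plan line
            print qq{  <plan tests="$1"/>\n};
        }
        elsif (/^(not )?ok\b\s*(\d*)\s*-?\s*(.*)/) {
            my $status = $1 ? 'fail' : 'pass';
            print qq{  <test number="$2" status="$status" desc="$3"/>\n};
        }
    }
    close $tap;
    print qq{</test-script>\n};
}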
Failed Test  Stat  Wstat  Total  Fail  List of Failed
------------------------------------------------------------------------
t/bar.t         4   1024     13     4  2 6-8
t/foo.t         1    256     10     1  5
(1 subtest UNEXPECTEDLY SUCCEEDED).
Failed 2/3 test scripts. 5/33 subtests failed.
Files=3, Tests=33, 0 wallclock secs ( 0.10 cusr + 0.01 csys = 0.11 CPU)
Failed 2/3 test programs. 5/33 subtests failed.
- What is 'stat'? How does it help identify the failures?
- Ditto 'Wstat'?
- What does 'UNEXPECTEDLY SUCCEEDED' mean?
If a test is designed to fail, then does it get reported as a failure when it does fail? Or is that an 'EXPECTED FAILURE'?
- Which test 'UNEXPECTEDLY SUCCEEDED'?
If it's not important enough to tell me which one, why is it important enough to bother mentioning it at all?
- What is the difference between "test scripts" and "test programs"?
And if they are the same thing, why is it necessary to give me the same information twice?
Actually, 3 times. "Files=3, Tests=33, " is just a subset of the same information above and below it.
- When was the last time anyone optimised their test scripts?
Is there any other use for that timing information?
Of course, you'll be taking my thoughts on this with a very large pinch of salt as I do not use these tools. The above are some of the minor reasons why not.
Much more important is that there are exactly two behaviours I need from a test harness.
- The default, no programming, no configuration, pull it and run it, out-of-the-box behaviour is that I point at a directory of tests and it runs them. If nothing fails, it should simply say that.
"Nothing failed" or "All tests passed".
I have no problem with a one line, in-place progress indicator ("\r..."), but it should not fill my screen buffer with redundant "that's ok and that's ok and that's ok" messages. I use my screen buffer to remember things I've just done: the results of compile attempts, greps etc.
Verbose output that tells me nothing useful, whilst pushing useful information off the top of my buffer is really annoying. Yes, I could redirect it to null, but then I won't see the useful stuff when something fails.
Converting 5/10 into a running percentage serves no purpose. A running percentage is only useful if it will allow me to predict how much longer the process will take. As the test harness doesn't know how many tests it will encounter up front, much less how long they will take, a percentage is just a meaningless number.
If I really want this summary information, or other verbose information (say because the tests are being run overnight by a scheduler and I'd like to see the summary information in the morning), I have no problem adding a command line switch (say -V or -V n) to obtain that information when I need it. (A rough sketch of the quiet, out-of-the-box behaviour I'm describing follows this list.)
- When something fails, tell me what failed and where. Eg. File and line number. (Not test number).
Preferably, it should tell me which source file/line number (not test file) I need to look at, but the entire architecture of the test tools just does not allow this, which is why I will continue to embed my tests in the file under test.
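Something along these lines -- a hypothetical, untested sketch, not anything that exists -- is all I mean by the out-of-the-box behaviour in the first point above: run everything under t/, stay silent unless something fails, and then show only the failures.

#!/usr/bin/perl
# Hypothetical sketch of a "quiet by default" runner: one line if all is
# well, otherwise just the output of the scripts that failed.
use strict;
use warnings;

my @failed;
for my $script ( sort glob 't/*.t' ) {
    my $output = qx($^X $script 2>&1);         # capture the script's TAP
    if ( $? != 0 or $output =~ /^not ok/m ) {  # non-zero exit or a failed test
        push @failed, "$script:\n$output\n";
    }
}

if (@failed) {
    print "FAILED:\n", @failed;
}
else {
    print "All tests passed.\n";
}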
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Thanks for the feedback. I hope the following helps.
- What is 'stat'? How does it help identify the failures?
- 'stat' is the exit code and indicates how many tests failed. However, since it doesn't report numbers in excess of 255, it's not terribly useful and I don't know that it's used.
- Ditto 'Wstat'?
- 'wstat' is the wait status of the test: the raw value from wait(), with the exit status in the high byte, so the 1024 and 256 above are just the Stat values 4 and 1 shifted up. I also don't know how it's used; initially I didn't provide it, but I was told emphatically on the QA list that I should, so I did.
- What does 'UNEXPECTEDLY SUCCEEDED' mean?
- This is the number of tests which were marked TODO but passed anyway.
- If a test is designed to fail, then does it get reported as a failure when it does fail? Or is that an 'EXPECTED FAILURE'?
- A test designed to fail is generally a TODO test and if it fails, it is not reported as a failure or as an 'EXPECTED FAILURE'. (See the sketch after this list for what a TODO test looks like.)
- Which test 'UNEXPECTEDLY SUCCEEDED'?
- Currently Test::Harness is not able to track or report which tests unexpectedly succeeded but TAPx::Harness can and does.
- If it's not important enough to tell me which one, why is it important enough to bother mentioning it at all?
- See the note to the previous question. It is important, but Test::Harness doesn't have this ability.
- What is the difference between "test scripts" and "test programs"?
- Nothing.
- And if they are the same thing, why is it neccesary to give me the same information twice?
- I don't understand this question.
- Actually, 3 times. "Files=3, Tests=33, " is just a subset of the same information above and below it.
- It's a summary report. You may find it useful or you may not. Alternate suggestions welcome :)
- When was the last time anyone optimised their test scripts? Is there any other use for that timing information?
- I sometimes use it when I'm profiling my code and trying to optimize it. The timing information often tells me whether I've made a significant difference.
- Verbose output that tells me nothing useful, whilst pushing useful information off the top of my buffer is really annoying. Yes, I could redirect it to null, but then I won't see the useful stuff when something fails.
- That's a good point. I could easily make a 'quiet' mode which only reports overall success or failure. That would let you rerun the test suite to see what actually failed, if anything.
- Converting 5/10 into a running percentage serves no purpose.
- Agreed. I was just trying to mimic the behavior of Test::Harness. Others have pointed out that it doesn't help and I'll probably just drop it.
- When something fails, tell me what failed and where. Eg. File and line number. (Not test number).
- Unfortunately, TAP format does not support this. Those data are embedded in the diagnostics and there is no way to disambiguate this information from the other diagnostic information. This is a feature that is planned for TAP 2.0.
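To make the TODO and file/line answers above concrete, here is a small made-up test file and, roughly, the TAP it emits (exact wording varies between Test::More versions). Note that the file and line number only ever appear inside the '#' diagnostic lines, which is why today's harnesses can't reliably extract them, and that the TODO test is the kind that would be counted as 'UNEXPECTEDLY SUCCEEDED' if it ever started passing.

# t/demo.t -- a made-up example, for illustration only
use strict;
use warnings;
use Test::More tests => 3;

is( 1 + 1, 2, 'addition' );            # passes
is( 2 + 2, 5, 'broken arithmetic' );   # fails

# A TODO test is expected to fail; if it passes anyway, the harness
# counts it as an unexpected success.
our $TODO;
TODO: {
    local $TODO = 'frobnicate() not implemented yet';
    is( frobnicate(), 42, 'frobnicate' );
}

sub frobnicate { return }

Run directly with perl, that produces TAP something like:

1..3
ok 1 - addition
not ok 2 - broken arithmetic
#   Failed test 'broken arithmetic'
#   at t/demo.t line 7.
#          got: '4'
#     expected: '5'
not ok 3 - frobnicate # TODO frobnicate() not implemented yet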
I might add that the runtests utility I've written (not yet on the CPAN but analogous to prove) allows you to specify which test harness you want to run your tests through. Thus, you can easily create a new harness to satisfy your particular needs.
Regarding the last portion of your comment: the distinction you're making is close to that between a user installing a module, and a developer running tests for his or her own code.
FWIW: for case #1, CPANPLUS doesn't by default emit output to screen when installing a module. It tells you when something went wrong. A test harness could be made to know how many tests it will run, and even conceivably an estimate of how long they are expected to take relative to each other, if this information is stored when the maintainer creates the distribution.
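To sketch the "stored when the maintainer creates the distribution" idea (everything here, including the t/TESTPLAN file name, is hypothetical; I don't know of anything that does this today), a dist-build step could scan the declared plans out of the test scripts and save them, so a harness knows the grand total before it runs anything:

# Hypothetical dist-build step: record each test script's declared plan.
use strict;
use warnings;

my %plan;
for my $script ( glob 't/*.t' ) {
    open my $fh, '<', $script or die "Can't read $script: $!";
    while (<$fh>) {
        if (/\btests\s*=>\s*(\d+)/) {   # e.g. "use Test::More tests => 13;"
            $plan{$script} = $1;
            last;
        }
    }
    close $fh;
}

open my $out, '>', 't/TESTPLAN' or die "Can't write t/TESTPLAN: $!";
print {$out} "$_\t$plan{$_}\n" for sort keys %plan;
close $out;

A harness reading that file could then report real progress (and, with recorded timings, even an estimate of time remaining) rather than a meaningless percentage.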
Regarding case #2, the developer, Pugs' Test.pm does in fact produce coordinates for test cases. We use this to cross-link information in the smoke matrices with the actual test code. There's no reason that I know of why this couldn't be ported to the Perl 5 testing frameworks.
File and line number makes no sense in the context of a failed test. Consider:
1: for my $key ( sort keys %tests ) {
2:     is( $tests{$key}, $wooby{$key} );
3: }
If you know that the test failed on line 2, that doesn't help you much.
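Which is part of why per-test descriptions, rather than line numbers, are what usually identifies a failure in a loop like that, e.g.:

for my $key ( sort keys %tests ) {
    is( $tests{$key}, $wooby{$key}, "value for '$key'" );
}

Then the failure report names the key, not just the line the loop happens to live on.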