Re^6: &1 is no faster than %2 when checking for oddness. (Careful what you benchmark)

Unsurprising. When the bodies of the iterator loops are doing next to nothing, or actually nothing, and Benchmark does its initial timings of them in order to calculate the number of iterations to run, it attempts to subtract a small amount to account for the overhead of the loop itself, with the result that the calculations are probably being subjected to rounding errors.

Bingo. Exactly the reason I don't use Benchmark.

When it takes 84 million iterations of a test to accumulate 1 second of cpu on a modern processor, it certainly indicates that something is wrong with your benchmark.

I wanted to show an example where a recent version of Benchmark still produces negative numbers. Knowing that the chance of finding a benchmark producing negative numbers is higher on tests that don't take much time, I picked such a test.

This is why I tend to incorporate for loops within the test when benchmarking very small pieces of code, rather than relying on the benchmark iteration count.

This is why I don't bother with Benchmark at all; if I'm going to write my own loops, I don't need a module to subtract two time stamps for me.

Replies are listed 'Best First'.

Re^7: &1 is no faster than %2 when checking for oddness. (Careful what you benchmark)
by BrowserUk (Patriarch) on Nov 20, 2006 at 10:40 UTC

Fair enough. Personally I like the math that Benchmark::cmpthese() does for me.

More to the point: even attempting to time operations that take so little time that Benchmark's internal math is subject to rounding errors is mostly pointless. More so if you only time one (or 10 or 100) occurrence(s) of that operation.

With operations that require 84 million iterations to accumulate 1 second of cpu--that's 0.000000012 seconds per!--you're not timing the operation. You're timing the time it takes to get two successive TOD values from the OS!
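To put that figure next to the cost of reading the clock, here is a rough sketch. It assumes Time::HiRes is available; the variable names and the loop count are illustrative only, and the clock cost it reports will differ from machine to machine.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Read the clock $n times inside one timed loop and average, to
# estimate what a single timestamp costs on this machine.
my $n     = 1_000_000;
my $start = Time::HiRes::time();
my $t;
$t = Time::HiRes::time() for 1 .. $n;
my $clock_cost = (Time::HiRes::time() - $start) / $n;

# 84 million iterations per cpu second, as quoted above.
my $op_cost = 1 / 84_000_000;

printf "one clock read: %.3g s; one operation: %.3g s\n",
    $clock_cost, $op_cost;
```

If a clock read comes out at the same order of magnitude as the operation or larger, a single start/end pair around one operation is measuring mostly the clock.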

Your numbers will vary widely depending upon whether a task switch occurred inside your timing window. So widely that your results will be meaningless.

The only way to derive any meaning from comparisons of such low cost operations is to do them 1000s of times, time the entire loop, and then divide (hence my expectation of your math). Sure, that means the overhead of the loop is measured also, but if the same (and cheapest) loop mechanism is used for all the tests, then the same overhead will be in all timings.

Whilst this renders the absolute values (end - start) completely useless for comparison purposes, the relative timings--

((end1 - start1)/n1) / ((end2 - start2)/n2)

is a useful function for comparisons. Not only does this minimise the overhead of the loop, it also minimises the differences between your cpu performance and mine; your OS and mine. Hence the relative performance ratios (percentages) that result are useful, whereas absolute numbers--of wall time; cpu time; or iteration counts in a given period--are completely useless.
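A hand-rolled version of that ratio might look like the sketch below. It assumes Time::HiRes for the timestamps; `per_iteration` is a made-up helper name, and the iteration count and test value are arbitrary.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: run $code $n times inside a single timed loop
# and return the average wall-clock seconds per iteration.
sub per_iteration {
    my ($n, $code) = @_;
    my $start = Time::HiRes::time();
    $code->() for 1 .. $n;
    my $end = Time::HiRes::time();
    return ($end - $start) / $n;
}

my $n = 1_000_000;
my $x = 12345;

# Same loop mechanism for both tests, so both carry the same overhead.
my $and = per_iteration($n, sub { my $odd = $x & 1 });
my $mod = per_iteration($n, sub { my $odd = $x % 2 });

# ((end1 - start1)/n1) / ((end2 - start2)/n2)
printf "&1 takes %.2f times as long as %%2\n", $and / $mod;
```

Note that the sub call in each iteration costs far more than either operator, so the ratio will sit close to 1. That loop overhead is the same on both sides, which is what keeps the comparison fair, but it also shows why the absolute numbers say little about the operators themselves.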

And guess what. These are exactly the figures that Benchmark::cmpthese() produces for you! And this is why I use Benchmark (and advocate the use of cmpthese() over timethese()).
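For reference, a minimal cmpthese() call along those lines could look like this. The test names and the inner loop of 1000 are my own choices for illustration, not something from the thread.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# A negative count asks Benchmark to run each test for at least that
# many cpu seconds. The loop of 1000 inside each sub keeps a single
# call from being too cheap to time reliably.
my $rows = cmpthese(-1, {
    and1 => sub { my $odd; $odd = $_ & 1 for 1 .. 1000; },
    mod2 => sub { my $odd; $odd = $_ % 2 for 1 .. 1000; },
});
```

cmpthese() prints a table with a rate per test and the percentage difference between each pair, and returns the same rows as an array reference.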

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.
