Never use q{...} with Benchmark; use sub {...} instead. It's too easy to make a scope mistake when using uncompiled code.

Hmm. Mistakes are mistakes. Using our for the "setup variables" that are used within the tests but declared outside them is a simple solution to the scoping problem.
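A minimal sketch of that scoping difference, assuming nothing beyond the stock Benchmark module. The names @shared, @private and %seen are mine, purely for illustration. Each snippet is run exactly once through timeit and records how many items it actually got to see:

```perl
use strict;
use warnings;
use Benchmark qw[ timeit ];

our @shared  = 1 .. 10;   # package variable: reachable from q[...] snippets
my  @private = 1 .. 10;   # lexical: reachable only through a closure

our %seen;                # how many matching items each snippet actually saw

# String form, naming the package variable.
timeit 1, q[ $seen{ eval_our } = () = grep { $_ <= 5 } @shared;  ];

# String form, naming the lexical: the scope mistake under discussion.
timeit 1, q[ $seen{ eval_my  } = () = grep { $_ <= 5 } @private; ];

# Coderef form: the closure captures the lexical.
timeit 1, sub { $seen{ sub_my } = () = grep { $_ <= 5 } @private; };

printf "%-8s saw %d matching items\n", $_, $seen{ $_ } for sort keys %seen;
```

As far as I can tell from Benchmark.pm, string snippets are compiled inside the module, in the caller's package and without strict, so eval_my should quietly end up timing a grep over an empty @main::private rather than failing. That is the mistake the our declaration avoids.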

Whilst using coderefs is an effective way of capturing external variables through closure, it has other side effects that can influence your benchmarks in much more subtle and harder-to-detect ways. This is especially true for the often-seen 'communal' benchmark, where different monks' implementations of a subroutine are tested and there are parameters to the subs:

    sub monk1 {
        my( $arg, @data ) = @_;
        ## do stuff
        return @results;
    }

    sub monk2 {
        my( $arg, @data ) = @_;
        ## do stuff
        return @results;
    }
    ....
    my @data = (....);

    cmpthese -3, {
        monk1 => sub{ my @results = monk1( @data ); },
        monk2 => sub{ my @results = monk2( @data ); },
        ....
    };

The problem here is that you are adding an extra level of subroutine call to each of the implementations under test. If the work being done inside the subroutine is substantial, then the extra overhead will be lost in the noise of the testing and have no substantial effect upon the outcome. But if the code being tested is itself not doing a great deal, then the additional overhead of the extra level of sub call can swamp the differences in the implementation and render the benchmark useless as a result.
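For what it's worth, here is a sketch of the same kind of communal benchmark written both ways, so the wrapper cost can be seen side by side. The two monkN bodies are hypothetical stand-ins of my own (a block grep and a list grep), and the data is declared with our only so that the string snippets can reach it:

```perl
use strict;
use warnings;
use Benchmark qw[ cmpthese ];

# Hypothetical stand-ins for two monks' implementations of the same task.
sub monk1 { my( $limit, @data ) = @_; return grep { $_ <= $limit } @data }
sub monk2 { my( $limit, @data ) = @_; return grep   $_ <= $limit,  @data }

# First element is the limit, the rest are the values to filter.
our @data = ( 5, 1 .. 10 );

cmpthese -1, {
    # String form: the call to monkN() is the only sub call inside the timed loop.
    eval_monk1 => q[ my @results = monk1( @data ); ],
    eval_monk2 => q[ my @results = monk2( @data ); ],

    # Coderef form: every iteration also pays for calling the anonymous wrapper.
    sub_monk1  => sub { my @results = monk1( @data ); },
    sub_monk2  => sub { my @results = monk2( @data ); },
};
```

The less work monkN() does per call, the larger the share of each sub_ timing that belongs to the wrapper rather than to the code under test.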

To demonstrate this effect using the grep example, if we grep a list of 10 items using your preferred coderef form of benchmark, we get these results:

    10 items
                        Rate sub_block_short sub_list_short
    sub_block_short 305975/s              --            -4%
    sub_list_short  319968/s              5%             --

from which you might conclude that the list form of grep is 5% faster than the block form. However, if you perform the same test using the q[ ... ] form of benchmark you get:

    10 items
                         Rate eval_block_short eval_list_short
    eval_block_short 322569/s               --             -1%
    eval_list_short  325028/s               1%              --

From which you might conclude that the list form is only 1% quicker! Assuming an otherwise quiescent system, this is a rather less dramatic difference that is probably confined to the realms of 'noise', and definitely not worth bothering with if you consider the block form clearer than the list form.

And the reason for the difference between the two tests is the overhead of the extra layer of subroutine call in the coderef form of the benchmark. That extra overhead completely obscures the actual differences we are attempting to ascertain and explore. In this case, using the coderef form renders the benchmark useless, as the construction of the benchmark itself has influenced and obscured the results we are attempting to define.

This stands out more when you compare the 4 forms together:

    10 items
                    Rate sub_blk_sh sub_lst_sh eval_blk_sh eval_lst_sh
    sub_blk_sh  305975/s         --        -4%         -5%         -6%
    sub_lst_sh  319968/s         5%         --         -1%         -2%
    eval_blk_sh 322569/s         5%         1%          --         -1%
    eval_lst_sh 325028/s         6%         2%          1%          --

With the eval_block form coming out as 1% faster than the sub_list form, despite the fact that the block form has to be slower due to the construction of the extra level of scope!

That clearly demonstrates that the construction of the benchmark itself can have a fairly dramatic influence upon the results obtained.

Of course, as the number of items in the list increases, the effect of the extra overhead becomes amortised over an increasingly large number of scope generations and has much less influence on the outcome.

    100_000 items
                  Rate eval_blk_vl sub_blk_vl eval_lst_vl sub_lst_vl
    eval_blk_vl 36.4/s          --        -1%         -3%        -4%
    sub_blk_vl  36.6/s          1%         --         -2%        -3%
    eval_lst_vl 37.4/s         3%*         2%          --        -1%
    sub_lst_vl  37.8/s          4%        3%*          1%         --

Here we see that the differences between the eval_list & block forms, and those between the sub_list & block forms have equalled out at 3%. Less dramatic than the 5% and more significant than 1%.

However, you will also notice that here the sub forms have overtaken the eval forms. What happened to that extra overhead? If there is an extra overhead attached to the coderef form of benchmark, you would expect that the eval forms would remain faster than their equivalent coderef forms. The differences would be reduced as the overhead is amortised, but still the overhead should be present and show up.

The conclusion I draw from that is that, at these levels, the cost of constructing the extra scope required by the block form of grep is so insignificant that it simply gets lost in the noise of the benchmark: the internal inconsistencies of the Benchmark module itself, the impact of Perl's memory management, and external factors relating to system usage. As such, I long ago abandoned the list form of grep in favour of the (IMO) syntactically clearer block form, as the efficiency justification for using the list form simply doesn't hold up in use.

Even for the most highly iterated, deeply nested uses of grep, the performance of the construct is much more likely to be influenced by whether I hit a graphics-heavy page in my browser, or list the files in my directory, or even just wiggle my mouse about a little, than by whether I have used the list or block form.
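One rough way to put a number on that noise is to time the very same snippet several times over and look at the spread. This is only a sketch: the run count and the one-second duration are arbitrary choices of mine, and the rate is worked out from the iters and cpu_a accessors of the Benchmark object that countit returns:

```perl
use strict;
use warnings;
use Benchmark qw[ countit ];

our @short = 1 .. 10;

# Time the *same* snippet three times; any difference between runs is noise.
my @rates;
for my $run ( 1 .. 3 ) {
    my $t = countit( 1, q[ my @results = grep { $_ <= 5 } @short; ] );
    push @rates, $t->iters / $t->cpu_a;    # iterations per CPU second
}

my( $min, $max ) = ( sort { $a <=> $b } @rates )[ 0, -1 ];

printf "run %d: %.0f/s\n", $_ + 1, $rates[ $_ ] for 0 .. $#rates;
printf "spread between identical runs: %.1f%%\n", 100 * ( $max - $min ) / $min;
```

If the spread between identical runs is of the same order as the difference reported between two forms, that difference tells you very little.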

If there are any overall conclusions to be drawn, they are:

  1. Benchmarking well is hard.
  2. Drawing the right conclusions from any given benchmark is even harder.
  3. And never, ever say(*) "Never use ..." or "Always use ..." or "You should (not) do ...".

(*)Except in this case of course :)

Benchmark code and full results:

    #! perl -slw
    use strict;
    use Benchmark qw[ cmpthese ];

    our @short  = 1 .. 10;
    our @medium = 1 .. 100;
    our @long   = 1 .. 1000;
    our @vlong  = 1 .. 100_000;

    print "10 items";
    cmpthese -3, {
        eval_list_short  => q[ my @results = grep $_ <= 5, @short; ],
        eval_block_short => q[ my @results = grep{ $_ <= 5 } @short; ],
        sub_list_short   => sub{ my @results = grep $_ <= 5, @short; },
        sub_block_short  => sub{ my @results = grep{ $_ <= 5 } @short; },
    };

    print "\n100 items";
    cmpthese -3, {
        eval_list_medium  => q[ my @results = grep $_ <= 50, @medium; ],
        eval_block_medium => q[ my @results = grep{ $_ <= 50 } @medium; ],
        sub_list_medium   => sub{ my @results = grep $_ <= 50, @medium; },
        sub_block_medium  => sub{ my @results = grep{ $_ <= 50 } @medium; },
    };

    print "\n1000 items";
    cmpthese -3, {
        eval_list_long  => q[ my @results = grep $_ <= 500, @long; ],
        eval_block_long => q[ my @results = grep{ $_ <= 500 } @long; ],
        sub_list_long   => sub{ my @results = grep $_ <= 500, @long; },
        sub_block_long  => sub{ my @results = grep{ $_ <= 500 } @long; },
    };

    print "\n100_000 items";
    cmpthese -3, {
        eval_list_vlong  => q[ my @results = grep $_ <= 50000, @vlong; ],
        eval_block_vlong => q[ my @results = grep{ $_ <= 50000 } @vlong; ],
        sub_list_vlong   => sub{ my @results = grep $_ <= 50000, @vlong; },
        sub_block_vlong  => sub{ my @results = grep{ $_ <= 50000 } @vlong; },
    };

    __END__
    C:\test>538622
    10 items
                         Rate sub_block_short sub_list_short eval_block_short eval_list_short
    sub_block_short  305975/s              --            -4%              -5%             -6%
    sub_list_short   319968/s              5%             --              -1%             -2%
    eval_block_short 322569/s              5%             1%               --             -1%
    eval_list_short  325028/s              6%             2%               1%              --

    100 items
                         Rate eval_block_medium sub_block_medium eval_list_medium sub_list_medium
    eval_block_medium 36772/s                --              -1%              -5%             -6%
    sub_block_medium  37156/s                1%               --              -4%             -5%
    eval_list_medium  38546/s                5%               4%               --             -1%
    sub_list_medium   39104/s                6%               5%               1%              --

    1000 items
                      Rate sub_block_long eval_block_long sub_list_long eval_list_long
    sub_block_long  3879/s             --             -0%           -3%            -4%
    eval_block_long 3889/s             0%              --           -2%            -3%
    sub_list_long   3989/s             3%              3%            --            -1%
    eval_list_long  4027/s             4%              4%            1%             --

    100_000 items
                       Rate eval_block_vlong sub_block_vlong eval_list_vlong sub_list_vlong
    eval_block_vlong 36.4/s               --             -1%             -3%            -4%
    sub_block_vlong  36.6/s               1%              --             -2%            -3%
    eval_list_vlong  37.4/s               3%              2%              --            -1%
    sub_list_vlong   37.8/s               4%              3%              1%             --

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^2: Benchmarking the block and list forms of grep by BrowserUk
in thread Benchmarking the block and list forms of grep by grinder
