ig has asked for the wisdom of the Perl Monks concerning the following question:

Is there a way to set initial conditions for iterations while benchmarking (with Benchmark or anything similar) such that the setup time is excluded from the benchmark?

I have

```perl
use strict;
use warnings;
use Benchmark;

my @strings = qw(exception:tex exception:mex asdf tex:exception:mex);

Benchmark::cmpthese( -5, {
    'one'   => sub { my @unfiltered = @strings; my @filtered = grep { /exception:(?!tex)/ } @unfiltered; },
    'two'   => sub { my @unfiltered = @strings; my @filtered = grep { /exception/ && !/tex/ } @unfiltered; },
    'three' => sub { my @unfiltered = @strings; my @filtered = grep { /exception:/g && !/\Gtex/ } @unfiltered; },
});
```

Each sub begins with "my @unfiltered = @strings" so that they all have the same setup overhead and so that pos is not carried from one iteration to the next in the third case (see strange behavior of grep with global match [resolved]).

While each alternative has the same overhead, that overhead obscures the differences between the pieces of code under test. I imagine that in some cases the setup overhead could be so great that the code under test makes a negligible difference. So, I would like to perform setup for each iteration but have the time that setup takes excluded from the benchmark calculation. I don't see any way to do this with Benchmark, nor do I know of any other module or tool that supports it.
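One workaround, outside Benchmark's API entirely, is a hand-rolled timing loop with Time::HiRes that starts the clock only after setup and accumulates just the measured spans. This is only a sketch (the iteration count is arbitrary, and per-iteration timer calls add a little noise of their own):

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my @strings = qw(exception:tex exception:mex asdf tex:exception:mex);

my $iterations = 100_000;
my $elapsed    = 0;

for ( 1 .. $iterations ) {
    my @unfiltered = @strings;    # setup: deliberately NOT timed

    my $t0 = [gettimeofday];      # start the clock after setup
    my @filtered = grep { /exception:(?!tex)/ } @unfiltered;
    $elapsed += tv_interval($t0); # stop the clock before the next setup
}

printf "%d iterations in %.3fs (%.0f/s)\n",
    $iterations, $elapsed, $iterations / ( $elapsed || 1 );
```

The same loop body can be repeated for each alternative being compared; only the accumulated `$elapsed` values are compared, so the setup cost cancels out of the results.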

Replies are listed 'Best First'.
Re: Initializing iterations while benchmarking
by BrowserUk (Patriarch) on Aug 07, 2009 at 10:16 UTC

    You're right to be concerned. These are the results of your original code on my system:

    C:\test>786728-1.pl
             Rate three   one   two
    three  95432/s    --  -13%  -35%
    one   110191/s   15%    --  -25%
    two   146082/s   53%   33%    --

    And these excluding the setup overhead:

    C:\test>786728-2.pl
              Rate   one   two three
    one   6935150/s    --   -0%   -6%
    two   6965018/s    0%    --   -5%
    three 7345874/s    6%    5%    --

    As you can see, the setup swamps the code under test and skews the results horribly.

    Here's my version of the benchmark:

```perl
use strict;
use warnings;
use Benchmark;

our @strings = qw(exception:tex exception:mex asdf tex:exception:mex);

Benchmark::cmpthese( -5, {
    'one'   => q[ my @filtered = grep { /exception:(?!tex)/ } @unfiltered; ],
    'two'   => q[ my @filtered = grep { /exception/ && !/tex/ } @unfiltered; ],
    'three' => q[ my @filtered = grep { /exception:/g && !/\Gtex/ } @unfiltered; ],
});
```

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Thanks BrowserUk but in your example @unfiltered is empty.

```perl
use strict;
use warnings;
use Benchmark;

our @strings = qw(exception:tex exception:mex asdf tex:exception:mex);

Benchmark::cmpthese( -5, {
    'one'   => q[ my @filtered = grep { /exception:(?!tex)/ } @strings; ],
    'two'   => q[ my @filtered = grep { /exception/ && !/tex/ } @strings; ],
    'three' => q[ my @filtered = grep { /exception:/g && !/\Gtex/ } @strings; ],
});
__END__
         Rate   one   two three
one   127000/s    --  -21%  -22%
two   161678/s   27%    --   -1%
three 163855/s   29%    1%    --
```

      This is still quite a bit different from the performance with the initializations included:

```perl
use strict;
use warnings;
use Benchmark;

our @strings = qw(exception:tex exception:mex asdf tex:exception:mex);

Benchmark::cmpthese( -5, {
    'one'   => q[ my @unfiltered = @strings; my @filtered = grep { /exception:(?!tex)/ } @unfiltered; ],
    'two'   => q[ my @unfiltered = @strings; my @filtered = grep { /exception/ && !/tex/ } @unfiltered; ],
    'three' => q[ my @unfiltered = @strings; my @filtered = grep { /exception:/g && !/\Gtex/ } @unfiltered; ],
});
__END__
        Rate three   one   two
three  81985/s    --  -14%  -24%
one    95390/s   16%    --  -12%
two   108473/s   32%   14%    --
```

      This illustrates that even simple initialization can have a significant impact on results. I imagine cases where the effect could be far greater, e.g. setting up the initial conditions for a database query, where the database itself must be reinitialized each time, the connection reopened, caches flushed, and so on.

Re: Initializing iterations while benchmarking
by alexm (Chaplain) on Aug 07, 2009 at 10:15 UTC

    I guess you can always set up your data outside the closures, just before the benchmark takes place, as in:

```perl
use strict;
use warnings;
use Benchmark;

my @strings = qw(exception:tex exception:mex asdf tex:exception:mex);
my @one   = @strings;
my @two   = @strings;
my @three = @strings;

Benchmark::cmpthese( -5, {
    'one'   => sub { my @filtered = grep { /exception:(?!tex)/ } @one; },
    'two'   => sub { my @filtered = grep { /exception/ && !/tex/ } @two; },
    'three' => sub { my @filtered = grep { /exception:/g && !/\Gtex/ } @three; },
});
```

        Okay, but then you can reset pos in case three, as you already explained.

```perl
sub { my @filtered = grep { pos = 0; /exception:/g && !/\Gtex/ } @three; }
```

        Is there something wrong with this? Aside from the overhead of resetting pos, of course.
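        For anyone unfamiliar with the mechanism being reset here: a scalar-context `m//g` match leaves `pos` set on the matched string, and since `$_` inside `grep` aliases the array element, that position survives into the next pass over the same array. `pos` is an lvalue, which is what makes the reset above possible. A minimal illustration:

```perl
use strict;
use warnings;

my $s = 'exception:tex';

$s =~ /exception:/g;                         # scalar-context /g match sets pos
printf "pos after match: %d\n", pos($s);     # 10 (just past "exception:")

pos($s) = 0;                                 # pos is an lvalue: reset it
printf "pos after reset: %d\n", pos($s);     # 0
```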

Re: Initializing iterations while benchmarking
by DStaal (Chaplain) on Aug 07, 2009 at 13:42 UTC

    As an alternate approach entirely: Devel::NYTProf will give you total time used by blocks and lines of code, so it could be used for some benchmarking.

```perl
use strict;
use warnings;

my @strings = qw(exception:tex exception:mex asdf tex:exception:mex);

for ( my $i = 0; $i < 500; $i++ ) {
    {
        my @unfiltered = @strings;
        my @filtered = grep { /exception:(?!tex)/ } @unfiltered;
    }
    {
        my @unfiltered = @strings;
        my @filtered1 = grep { /exception/ && !/tex/ } @unfiltered;
    }
    {
        my @unfiltered = @strings;
        my @filtered2 = grep { /exception:/g && !/\Gtex/ } @unfiltered;
    }
}
```

    Run that with perl -d:NYTProf, and you'll be able to get output that will list how long it took on average for each grep, and the total time for each grep, having run through them 500 times. It won't quite be as neat for benchmarking as Benchmark, but it'll give you the data. (Note: Code has only been run in Perlmonks, not perl.)
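    The invocation looks something like the following (assuming Devel::NYTProf is installed; the script name here is just a placeholder):

```shell
# Profile the script; timing data is written to ./nytprof.out
perl -d:NYTProf filter_bench.pl

# Render the per-line/per-block HTML report into ./nytprof/
nytprofhtml
```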