Re^5: Benchmark.pm: Does subroutine testing order bias results?

Sure. As you can see, it's not the lexically first test that gets biased. It's the first iteration of that test. That explains why the bias is more pronounced the fewer runs you do.

By running all the tests once and discarding the results, you even up the playing field, and the second cmpthese shows a much better distribution.

You should also consider shutting down as much as possible of whatever else is running on your box for the duration of the tests. For example, if my dial-up connection times out during a test, a high-priority thread runs for the duration of the reconnect. That can completely skew the results.

Even using the mouse to pop up the task manager will have some effect. But if this is enough to obscure the gains you have made, it probably means that they are so small as to be subject to random variation anyway.
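One way to judge whether a gain is bigger than the random variation is to time the same sub several times and look at the spread. What follows is only a rough sketch, not part of the original script: `spread` and `time_runs` are hypothetical helper names, the workload size of 2000 is an arbitrary assumption, and it uses wall-clock time from Time::HiRes rather than the CPU time Benchmark reports.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Hypothetical helper: mean and population standard deviation of a list.
sub spread {
    my @samples = @_;
    my $n       = @samples;
    my $mean    = 0;
    $mean += $_ / $n for @samples;
    my $var = 0;
    $var += ( $_ - $mean )**2 / $n for @samples;
    return ( $mean, sqrt($var) );
}

# Same kind of workload as the test sub in this thread (size is an assumption).
sub test {
    my @strings = map { ' ' x 1000 } 1 .. 2000;
    return scalar @strings;
}

# Hypothetical helper: wall-clock seconds for each of $runs separate calls.
sub time_runs {
    my ( $code, $runs ) = @_;
    my @elapsed;
    for ( 1 .. $runs ) {
        my $start = time;
        $code->();
        push @elapsed, time - $start;
    }
    return @elapsed;
}

test();    # warm-up call, result discarded
my @elapsed = time_runs( \&test, 10 );
my ( $mean, $sd ) = spread(@elapsed);
printf "mean %.6fs, std dev %.6fs over %d runs\n", $mean, $sd, scalar @elapsed;
```

If the difference between two candidates is of the same order as the standard deviation printed here, it is probably not worth chasing.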

#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];

our $ITERS ||= 5;
our $REPS  ||= 10000;

sub test { 
    my @strings = map{ ' ' x 1000 } 1 .. $REPS; 
}

my %tests = ( Atest => \&test, Btest => \&test, Ctest => \&test, Dtest => \&test, );

## Ignore the results produced by this run
cmpthese( 1,      \%tests);

## These should show more even distribution. 
cmpthese( $ITERS, \%tests); 

P:\test>373536-2 -ITERS=10
        Rate Dtest Btest Ctest Atest
Dtest 4.27/s    --   -0%   -7%  -67%
Btest 4.27/s    0%    --   -7%  -67%
Ctest 4.59/s    7%    7%    --  -64%
Atest 12.8/s  200%  200%  179%    --
        Rate Ctest Dtest Atest Btest
Ctest 4.10/s    --   -1%   -1%   -1%
Dtest 4.13/s    1%    --   -0%   -0%
Atest 4.13/s    1%    0%    --   -0%
Btest 4.13/s    1%    0%    0%    --
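Since the first chart is only there to be ignored, the warm-up pass does not need to print it. As a sketch of that variant: Benchmark documents a 'none' style for timethese that produces no timing output, so the throwaway pass can use it. The `$reps` value and the iteration count of 5 are assumptions for illustration, and the "too few iterations" warning may still appear.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw( timethese cmpthese );

my $reps = 1000;    # assumed size; the runs in this thread use 10000 and up

sub test {
    my @strings = map { ' ' x 1000 } 1 .. $reps;
}

my %tests = (
    Atest => \&test,
    Btest => \&test,
    Ctest => \&test,
    Dtest => \&test,
);

# Warm-up pass: style 'none' asks Benchmark not to print the timings,
# and the returned hash of results is deliberately ignored.
my $warmup = timethese( 1, \%tests, 'none' );

# Measured pass: only this comparison chart is printed.
cmpthese( 5, \%tests );
```

The effect on the measured pass should be the same as calling cmpthese twice; the only difference is that the misleading first chart never reaches the screen.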

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon


Replies are listed 'Best First'.

Re^6: Benchmark.pm: Does subroutine testing order bias results?
by jkeenan1 (Deacon) on Jul 19, 2004 at 21:30 UTC

BrowserUk: The results were exactly as you predicted. Here is a set of tests of cmpthese() which parallels the results I posted earlier from runs on Win2K and Darwin. I will now try to adapt this approach to my original problem. Thanks for taking the time to look at this.

#!/usr/local/bin/perl
use strict;
use warnings;
use Benchmark qw[ timethese cmpthese ];

# Usage: buk.pl iterations records

die "Need 2 numeric command-line arguments\n" unless (
    @ARGV == 2 and
    ($ARGV[0] =~ /^\d+$/ and $ARGV[0] > 0) and
    ($ARGV[1] =~ /^\d+$/ and $ARGV[1] > 0)
);

my ($iterations, $records) = @ARGV;

print "\n#   .  Testing $iterations iterations of $records elements ...\n\n";

my %tests = (
    Atest => \&test,
    Btest => \&test,
    Ctest => \&test,
    Dtest => \&test,
);

cmpthese( 1 , \%tests); # to clear up memory per browseruk
cmpthese( $iterations, \%tests);

sub test {
    my @strings = map{ ' ' x 1000 } 1 .. $records;
}


__END__

# 1.
# Testing BrowserUk's second version of script intended
# to work around problems in &Benchmark::cmpthese
# Perl 5.8.4; Darwin (Mac OS X, version 10.3)

#   a.  Testing 5 iterations of 25000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        Rate Dtest Ctest Btest Atest
Dtest 1.05/s    --   -2%   -3%  -64%
Ctest 1.07/s    2%    --   -2%  -63%
Btest 1.08/s    3%    2%    --  -63%
Atest 2.91/s  177%  173%  168%    --
        Rate Ctest Btest Atest Dtest
Ctest 1.04/s    --   -0%   -0%   -0%
Btest 1.04/s    0%    --    0%   -0%
Atest 1.04/s    0%    0%    --   -0%
Dtest 1.04/s    0%    0%    0%    --


#   b.  Testing 5 iterations of 50000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
      s/iter Ctest Dtest Btest Atest
Ctest   5.70    --   -0%   -0%  -85%
Dtest   5.70    0%    --   -0%  -85%
Btest   5.69    0%    0%    --  -85%
Atest  0.859  564%  564%  562%    --
      s/iter Ctest Btest Dtest Atest
Ctest   5.79    --   -0%   -1%   -1%
Btest   5.76    0%    --   -1%   -1%
Dtest   5.72    1%    1%    --   -0%
Atest   5.71    1%    1%    0%    --


#   c.  Testing 50 iterations of 25000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        Rate Dtest Ctest Btest Atest
Dtest 1.05/s    --   -2%   -3%  -64%
Ctest 1.07/s    2%    --   -2%  -63%
Btest 1.08/s    3%    2%    --  -63%
Atest 2.91/s  177%  173%  168%    --
        Rate Ctest Btest Dtest Atest
Ctest 1.02/s    --   -0%   -1%   -1%
Btest 1.02/s    0%    --   -1%   -1%
Dtest 1.03/s    1%    1%    --   -1%
Atest 1.04/s    1%    1%    1%    --



#   d.  Testing 50 iterations of 50000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
      s/iter Dtest Ctest Btest Atest
Dtest   5.70    --   -0%   -0%  -85%
Ctest   5.70    0%    --   -0%  -85%
Btest   5.69    0%    0%    --  -85%
Atest  0.859  564%  564%  562%    --
      s/iter Dtest Atest Ctest Btest
Dtest   5.75    --   -0%   -1%   -1%
Atest   5.73    0%    --   -0%   -1%
Ctest   5.72    1%    0%    --   -0%
Btest   5.70    1%    1%    0%    --


#   e.  Testing 100 iterations of 25000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        Rate Ctest Dtest Btest Atest
Ctest 1.05/s    --   -0%   -3%  -62%
Dtest 1.05/s    0%    --   -3%  -62%
Btest 1.08/s    3%    3%    --  -61%
Atest 2.78/s  165%  165%  156%    --
        Rate Atest Btest Dtest Ctest
Atest 1.03/s    --   -0%   -0%   -1%
Btest 1.03/s    0%    --   -0%   -1%
Dtest 1.03/s    0%    0%    --   -0%
Ctest 1.04/s    1%    1%    0%    --


#   f.  Testing 100 iterations of 50000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
      s/iter Ctest Dtest Btest Atest
Ctest   5.70    --   -0%   -0%  -85%
Dtest   5.70    0%    --   -0%  -85%
Btest   5.70    0%    0%    --  -85%
Atest  0.875  552%  552%  552%    --
      s/iter Btest Atest Ctest Dtest
Btest   5.75    --   -1%   -1%   -1%
Atest   5.72    1%    --   -0%   -0%
Ctest   5.72    1%    0%    --   -0%
Dtest   5.70    1%    0%    0%    --

# 2.
# Testing BrowserUk's second version of script intended
# to work around problems in &Benchmark::cmpthese
# Perl 5.8.0; Windows2000 Professional

#   a.  Testing 5 iterations of 25000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        Rate Atest Dtest Btest Ctest
Atest 1.20/s    --  -27%  -28%  -28%
Dtest 1.64/s   36%    --   -2%   -2%
Btest 1.67/s   38%    2%    --   -0%
Ctest 1.67/s   38%    2%    0%    --
        Rate Btest Atest Ctest Dtest
Btest 1.65/s    --   -0%   -1%   -1%
Atest 1.66/s    0%    --   -0%   -1%
Ctest 1.66/s    1%    0%    --   -1%
Dtest 1.67/s    1%    1%    1%    --

#   b.  Testing 5 iterations of 50000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
      s/iter Atest Ctest Dtest Btest
Atest   1.61    --  -25%  -25%  -26%
Ctest   1.20   34%    --    0%   -1%
Dtest   1.20   34%    0%    --   -1%
Btest   1.19   35%    1%    1%    --
      s/iter Ctest Dtest Btest Atest
Ctest   1.22    --   -0%   -1%   -1%
Dtest   1.21    0%    --   -0%   -0%
Btest   1.21    1%    0%    --   -0%
Atest   1.21    1%    0%    0%    --

#   c.  Testing 50 iterations of 25000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        Rate Atest Btest Dtest Ctest
Atest 1.23/s    --  -23%  -25%  -27%
Btest 1.61/s   31%    --   -2%   -5%
Dtest 1.64/s   33%    2%    --   -3%
Ctest 1.69/s   37%    5%    3%    --
        Rate Atest Btest Dtest Ctest
Atest 1.67/s    --   -0%   -0%   -0%
Btest 1.67/s    0%    --   -0%   -0%
Dtest 1.67/s    0%    0%    --   -0%
Ctest 1.67/s    0%    0%    0%    --

#   d.  Testing 50 iterations of 50000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
      s/iter Atest Dtest Ctest Btest
Atest   1.61    --  -24%  -25%  -25%
Dtest   1.22   32%    --   -1%   -2%
Ctest   1.21   33%    1%    --   -1%
Btest   1.20   34%    2%    1%    --
      s/iter Atest Ctest Btest Dtest
Atest   1.21    --   -0%   -0%   -1%
Ctest   1.21    0%    --   -0%   -1%
Btest   1.21    0%    0%    --   -1%
Dtest   1.20    1%    1%    1%    --

#   e.  Testing 100 iterations of 25000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        Rate Atest Ctest Dtest Btest
Atest 1.23/s    --  -26%  -28%  -28%
Ctest 1.67/s   35%    --   -3%   -3%
Dtest 1.72/s   40%    3%    --   -0%
Btest 1.72/s   40%    3%    0%    --
        Rate Dtest Ctest Btest Atest
Dtest 1.66/s    --   -0%   -1%   -1%
Ctest 1.67/s    0%    --   -1%   -1%
Btest 1.68/s    1%    1%    --   -0%
Atest 1.68/s    1%    1%    0%    --

#   f.  Testing 100 iterations of 50000 elements ...

            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
      s/iter Atest Ctest Dtest Btest
Atest   1.64    --  -26%  -26%  -27%
Ctest   1.22   34%    --   -1%   -2%
Dtest   1.21   36%    1%    --   -2%
Btest   1.19   38%    3%    2%    --
      s/iter Btest Ctest Atest Dtest
Btest   1.21    --   -0%   -0%   -0%
Ctest   1.21    0%    --   -0%   -0%
Atest   1.21    0%    0%    --   -0%
Dtest   1.20    0%    0%    0%    --

System Info: same as in previous posting
