Keystroke has asked for the wisdom of the Perl Monks concerning the following question:

I've been benchmarking code all morning and this doesn't seem right. By itself
tr is slower than eq in comparing 13-character strings. I understand and
completely agree with that. What is strange to me is that tr is faster in a
loop than eq. The tr command in the eq loop should never execute. What's
happening?
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $string = '0'x13;

cmpthese(0, {
    'eq_loop' => q(for (0..12) {
        next if ($string eq '0000000000000');
        $_ = ($string =~ tr/1/1/);
    }),
    'tr_loop' => q(for (0..12) {
        next unless ($_ = ($string =~ tr/1/1/));
    }),
    'eq' => q($string eq '0000000000000';),
    'tr' => q($string =~ tr/1/1/;),
});

Benchmark: running eq, eq_loop, tr, tr_loop, each for at least 3 CPU seconds...
        eq:  4 wallclock secs ( 3.30 usr +  0.00 sys =  3.30 CPU) @ 1173221.21/s (n=3871630)
   eq_loop:  4 wallclock secs ( 3.00 usr +  0.00 sys =  3.00 CPU) @ 16577.00/s (n=49731)
        tr:  3 wallclock secs ( 3.08 usr +  0.00 sys =  3.08 CPU) @ 536228.57/s (n=1651584)
   tr_loop:  4 wallclock secs ( 3.17 usr +  0.00 sys =  3.17 CPU) @ 18796.53/s (n=59585)
             Rate eq_loop tr_loop      tr      eq
eq_loop   16577/s      --    -12%    -97%    -99%
tr_loop   18797/s     13%      --    -96%    -98%
tr       536229/s   3135%   2753%      --    -54%
eq      1173221/s   6977%   6142%    119%      --

Re: tr faster than eq?
by hv (Prior) on Mar 10, 2005 at 17:57 UTC

    When the code is evalled, your lexical $string is not in scope, so all these tests are running against undef.

    If I change the declaration to our $string, and refer to $::string within all the evals I get:

                  Rate tr_loop eq_loop      tr      eq
    tr_loop    96844/s      --    -26%    -96%    -98%
    eq_loop   130666/s     35%      --    -95%    -97%
    tr       2436420/s   2416%   1765%      --    -41%
    eq       4137436/s   4172%   3066%     70%      --
    which seems more reasonable.
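
The scoping problem described above can be reproduced without Benchmark at all. This is only a minimal sketch: the `Elsewhere` package and its `run_string` helper are hypothetical stand-ins for a module that string-evals code handed to it, not Benchmark's actual internals. It assumes, as the original benchmark's behaviour suggests, that the eval runs without strict or warnings in effect.

```perl
use strict;
use warnings;

package Elsewhere;

# Hypothetical stand-in for a module that compiles caller-supplied code
# strings. The eval happens here, so lexicals declared in the calling
# file are not visible; only package variables are.
sub run_string {
    my ($code) = @_;
    my $pkg = caller;            # compile in the caller's package
    no strict;                   # assumption: no strictures inside the eval
    no warnings;                 # assumption: no warnings inside the eval
    return eval "package $pkg; $code";
}

package main;

my  $lexical = '0' x 13;         # invisible to the eval in Elsewhere
our $global  = '0' x 13;         # visible as $main::global

my $seen_lexical = Elsewhere::run_string(q{ defined $lexical ? 1 : 0 });
my $seen_global  = Elsewhere::run_string(q{ defined $global  ? 1 : 0 });

print "lexical defined inside eval: $seen_lexical\n";
print "global defined inside eval:  $seen_global\n";
```

Inside the eval, `$lexical` silently refers to the never-assigned package variable of the same name, which is why the original benchmark compared against undef without complaint.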

    Hugo

      That explains my benchmark code results. But I still have
      conflicting results from Devel::SmallProf for the execution
      times of tr and eq.
      count wall tm  cpu time line
      16183 0.197199 0.700000  368: next unless ($_ = ($_ =~ tr/1/1/));

      count wall tm  cpu time line
      16183 0.282621 0.750000  369: next if ($_ eq '0000000000000');
      15923 0.157497 0.890000  370: $_ = ($_ =~ tr/1/1/);
      Changing the 'my' to 'our' is enough to do the trick. Changing '$string' to '$::string' isn't necessary; the Benchmark module runs the code in the package of the caller.
Re: tr faster than eq?
by thor (Priest) on Mar 10, 2005 at 17:22 UTC
    At first, I was confused. I ran your script as given and produced similar results. Then, I changed your benchmark slightly and got different results.
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my $string = '0' x 13;

    cmpthese( 0, {
        'eq_loop_string' => q{
            for ( 0 .. 12 ) {
                next if ( $string eq '0000000000000' );
                $_ = ( $string =~ tr/1/1/ );
            }
        },
        'eq_loop_sub' => sub{
            for ( 0 .. 12 ) {
                next if ( $string eq '0000000000000' );
                $_ = ( $string =~ tr/1/1/ );
            }
        },
        'tr_loop_string' => q{
            for ( 0 .. 12 ) {
                next unless ( $_ = ( $string =~ tr/1/1/ ) );
            }
        },
        'tr_loop_sub' => sub{
            for ( 0 .. 12 ) {
                next unless ( $_ = ( $string =~ tr/1/1/ ) );
            }
        },
        'eq' => q($string eq '0000000000000';),
        'tr' => q($string =~ tr/1/1/;),
    } );
    And now the results:
                        Rate eq_loop_string tr_loop_sub tr_loop_string eq_loop_sub   tr   eq
    eq_loop_string  124852/s             --         -8%           -15%        -38% -97% -98%
    tr_loop_sub     136139/s             9%          --            -7%        -32% -97% -98%
    tr_loop_string  147082/s            18%          8%             --        -27% -97% -98%
    eq_loop_sub     200481/s            61%         47%            36%          -- -95% -97%
    tr             4356550/s          3389%       3100%          2862%       2073%   -- -42%
    eq             7530073/s          5931%       5431%          5020%       3656%  73%   --
    
    Is it possible that perl was having to eval the strings each time the sub was called, thus skewing the results?

    thor

    Feel the white light, the light within
    Be your own disciple, fan the sparks of will
    For all of us waiting, your kingdom will come

      Is it possible that perl was having to eval the strings each time the sub was called, thus skewing the results?

      No, it's because they are evaluated somewhere without access to the lexical $string, so they act upon the undefined global $string. Your solution causes the test cases to be compiled where $string is still in scope.

        While what you say makes sense, why am I not getting a "Use of uninitialized value in string" warning when that code is run? Warnings are turned on... unless it's because the eval happens in the Benchmark package and not the main package.

        thor


      Is it possible that perl was having to eval the strings each time the sub was called, thus skewing the results?
      No, that's not how the Benchmark module works. Once the module knows how often to run the code fragments (easy if the first argument is positive; it requires some test runs if the argument is zero or negative), it creates, as a string, a loop of the form:
      for (1 .. $n) {$code}
      where $n is the number of times to run, and $code the code you gave as an argument. And then it string evals that. There's no skewing due to re-evalling the string.
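
The compile-once, run-many idea described above can be sketched as follows. This is an illustration under stated assumptions, not the Benchmark module's real code: `time_code_string` is a hypothetical helper, the loop is wrapped in an anonymous sub so that it can be compiled a single time, and Time::HiRes wall-clock time stands in for Benchmark's CPU accounting.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

our $string = '0' x 13;   # package variable, so the eval'd code can see it

# Hypothetical helper: wrap the code string in a loop, compile the whole
# thing once with a single string eval, then time one call to the result.
sub time_code_string {
    my ($n, $code) = @_;
    my $loop = eval "sub { for (1 .. $n) { $code } }"
        or die "could not compile code string: $@";
    my $start = time;
    $loop->();                 # runs $code $n times with no further evals
    return time - $start;
}

my $elapsed = time_code_string(100_000, q{ $string =~ tr/1/1/ });
printf "100000 iterations took %.4f seconds\n", $elapsed;
```

Because the string is compiled before the clock starts, the cost of the eval itself does not appear in the measured time.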

      OTOH, if you use subs as arguments, you are skewing the results, as the sub calls themselves can easily have a significant effect on the runtime of the code you want to test.

Re: tr faster than eq?
by neosamuri (Friar) on Mar 10, 2005 at 23:55 UTC
    I believe that tr/// is doing a little optimization by actually running the tr only once, since the result won't change. Each time it is called thereafter, it only has to return the result that it already found.
Re: tr faster than eq?
by Elijah (Hermit) on Mar 10, 2005 at 23:22 UTC
    FYI...
    tr/// is also faster than s/// for replacement. The limitation is that tr does not remove characters, so if your intent is to remove a matching pattern then s/// must be used, but if a simple change is desired then tr/// should be used.

      Of course, if your intent is to remove a matching pattern, the s/// operator should be used; the reason is that s/// is capable of operating on patterns, and tr/// is not. tr/// operates on characters, not patterns. However, it's incorrect to say that tr/// can't remove characters. That's exactly what the /d modifier is for.
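
A small sketch of the /d modifier next to the s/// equivalent; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = 'a1b22c333';

# Without /d and with an empty replacement list, tr/// only counts the
# matching characters and leaves the string unchanged.
my $digit_count = ($text =~ tr/0-9//);

# With /d, every character in the search list is deleted from the copy.
(my $via_tr = $text) =~ tr/0-9//d;

# The s/// version works on a pattern rather than a character list.
(my $via_s = $text) =~ s/[0-9]+//g;

print "$digit_count digits; tr///d gives '$via_tr'; s///g gives '$via_s'\n";
```

Both approaches produce the same string here; s/// is only required once the thing being removed is a pattern rather than a fixed set of characters.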


      Dave