in reply to Should multiplication by integer be avoided in favour of division for performance reasons?

Use strings instead of subs and the benchmark numbers will be closer to reality

Re^2: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls)
by Anonymous Monk on Nov 29, 2019 at 03:37 UTC

    It's your basic benchmark pitfall: the overhead overshadows the thing you're timing.

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Benchmark 'cmpthese';
    my @a = map log, 1..1e6;
    cmpthese( -2, {
        1 => sub {[ map $_ * 4, @a ]},
        2 => sub {[ map $_ / (1/4), @a ]},
    });
    cmpthese( -2, {
        1 => sub {[ map $_ * 4, @a ]},
        2 => sub {[ map $_ / (1/4), @a ]},
    });
    cmpthese( -2, {
        1 => sub {[ map $_ * 4, @a ]},
        2 => sub {[ map $_ / (1/4), @a ]},
    });
    cmpthese( -2, {
        1 => '[ map $_ * 4, @a ]',
        2 => '[ map $_ / (1/4), @a ]',
    });
    cmpthese( -2, {
        1 => '[ map $_ * 4, @a ]',
        2 => '[ map $_ / (1/4), @a ]',
    });
    cmpthese( -2, {
        1 => '[ map $_ * 4, @a ]',
        2 => '[ map $_ / (1/4), @a ]',
    });
    __END__

    But as you can see, there is no difference between the two

        Rate    1    2
    1 1.73/s   -- -18%
    2 2.11/s  22%   --
        Rate    1    2
    1 1.70/s   -- -19%
    2 2.11/s  24%   --
        Rate    1    2
    1 1.75/s   -- -17%
    2 2.10/s  20%   --
           Rate   1   2
    1 2371151/s  -- -3%
    2 2438111/s  3%  --
           Rate   2   1
    2 2272556/s  -- -8%
    1 2473993/s  9%  --
           Rate   2   1
    2 2140327/s  -- -8%
    1 2313868/s  8%  --
    

      I know there are overheads to calling subs, but did not think they were sufficiently large that inline code runs millions of times per second while code refs run at fewer than ten times per second.

      Purely out of curiosity, are the process and overheads documented anywhere? I can think of things like setting up @_ and other bookkeeping processes, but perhaps there are other steps.

        When you call a sub, Perl needs to store the current @_ on the stack, along with data for caller in case that's called, needs to track call context (scalar, list, or void), plus the address of where to return to when the sub call has completed. Some of this can be avoided by using goto. Method calls are even more work because you need to look up at run time which sub to call, including traversing @ISA.

        I'm pretty sure a lot of this is documented in Perl's XS documentation because calling a sub in XS is pretty manual and you have to do a lot of this yourself in C code. (Though there are macros to simplify it.)

        Sub calls in Perl are one of the most time-expensive built-in operations that don't involve the filesystem or network.
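        As a small illustration of the per-call bookkeeping described above, the sketch below counts the frames that caller can see after an ordinary call and after a goto &sub. The names frame_count, via_call and via_goto are made up for this example. It only shows that goto &sub replaces the current frame instead of stacking a new one; it is not a claim about how much time that saves.

```perl
use strict;
use warnings;

# Count how many sub-call frames are currently on Perl's call stack.
sub frame_count {
    my $depth = 0;
    $depth++ while caller($depth);
    return $depth;
}

# Ordinary call: via_call's own frame stays on the stack while
# frame_count runs, so caller() can still see it.
sub via_call { return frame_count() }

# goto &sub: via_goto's frame is replaced by frame_count's frame,
# and the current @_ is handed over unchanged.
sub via_goto { goto &frame_count }

my $with_call = via_call();
my $with_goto = via_goto();

printf "plain call: %d frame(s), goto: %d frame(s)\n",
    $with_call, $with_goto;
```

        The plain call should report exactly one frame more than the goto version, because the wrapper's frame is still there for caller to find.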

        In Type::Tiny, I go to ridiculous lengths to avoid sub calls. Like if you use a type constraint like ArrayRef[Int] you might think that there would be one sub to check that something is an arrayref, and then that would call the sub to check ints once for each element of the arrayref. But...

        use Types::Standard qw( Int ArrayRef );
        my $type = ArrayRef[ Int->where('$_ >= 0') ];
        my $check = $type->compiled_check;
        # The following check is ONE sub call. Just one.
        if ($check->(\@somearray)) {
            ...;
        }

        You might be interested in Sub::Block which automates some inlining stuff, especially with grep and map.
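        The general inlining idea can be sketched in plain Perl: keep the element check as source text and splice it into the body of one generated sub, so checking a whole array costs a single sub call. This is only a hand-rolled illustration; it is not the code Type::Tiny or Sub::Block actually generates, and $int_src and $is_arrayref_of_int are names invented here.

```perl
use strict;
use warnings;

# Hypothetical per-element check, kept as source text rather than as a sub.
my $int_src = q{ (defined($_) && !ref($_) && /\A-?[0-9]+\z/) };

# Build the source of ONE sub whose loop body contains the check inline.
my $source = qq{
    sub {
        my (\$value) = \@_;
        return 0 unless ref(\$value) eq 'ARRAY';
        for (\@\$value) {
            return 0 unless $int_src;
        }
        return 1;
    }
};

# Compile it once; every later check is a single call to this code ref.
my $is_arrayref_of_int = eval $source or die $@;

for my $candidate ( [ 1, 2, 3 ], [ 1, 'x' ], 'not an array' ) {
    print $is_arrayref_of_int->($candidate) ? "pass\n" : "fail\n";
}
```

        The cost is paid once at compile time (one string eval) in exchange for avoiding a nested sub call per element at run time.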

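      There is no @main::a. The code strings are compiled by Benchmark outside the scope of your lexical my @a, so inside them @a means the (empty) package variable @main::a. You are benchmarking a couple of no-ops.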

      use strict;
      use warnings;
      use Benchmark qw/ cmpthese timeit /;
      our @b = my @a = map log, 1..1e6;
      timeit( 1, 'print "\$#a = $#a\n";' );
      timeit( 1, 'print "\$#b = $#b\n";' );
      cmpthese( -2, {
          a => '[ map $_ * 4, @a ]',
          b => '[ map $_ * 4, @b ]',
      });
      cmpthese( -2, {
          1 => '[ map $_ * 4, @b ]',
          2 => '[ map $_ / (1/4), @b ]',
      });
      __END__
      $#a = -1
      $#b = 999999
                Rate          b          a
      b       7.70/s         --      -100%
      a   10680591/s 138780824%         --
          Rate    1    2
      1 7.76/s   -- -28%
      2 10.7/s  39%   --
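      The scoping problem can be reproduced without Benchmark. In the sketch below, Hypothetical::Runner::run_string is an invented stand-in that roughly mimics what a benchmarking module does with a code string: it compiles the string in package main, away from the caller's lexical scope. It is not Benchmark's actual implementation.

```perl
use strict;
use warnings;

{
    package Hypothetical::Runner;

    # Stand-in for a benchmarking module: compile and run a code string
    # in package main, outside the caller's lexical scope.
    sub run_string {
        my ($code) = @_;
        my $result = eval "package main; no strict 'vars'; $code";
        die $@ if $@;
        return $result;
    }
}

my  @lexical = (1, 2, 3);   # lexical: invisible to the eval in run_string
our @global  = (1, 2, 3);   # package variable: this is @main::global

print "lexical: ", Hypothetical::Runner::run_string('scalar @lexical'), "\n";
print "global:  ", Hypothetical::Runner::run_string('scalar @global'),  "\n";
```

      The first line reports 0 elements and the second reports 3: inside the string, @lexical silently resolves to the empty @main::lexical, which is the same trap as @a in the benchmark above.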
        The issue only occurs when a specific multiply op (i.e. a particular '*' on a particular line) is called multiple times, returning an integer result on one occasion and a float on later occasions. When this occurs the PADTMP (the private variable that the op uses to return its result) gets upgraded from an SVt_IV to an SVt_PVNV, which is the smallest type that can hold both an integer and a float. As it happens it can also potentially hold a string, and because of this, it is more expensive to free. So the extra time you're seeing in the benchmarks is just due to freeing the temporary array's now-more-complex elements. You can see a similar effect here, which involves no arithmetic:

        use Benchmark 'cmpthese';
        my $x = 1;
        $x = 1.1;   # $x promoted to PVNV
        our @a = map $x, 1..1e6;
        my $y = 1.1;
        our @b = map $y, 1..1e6;
        cmpthese( -2, {
            1 => '{my @c = @a}',
            2 => '{my @c = @b}',
        });
        __END__
            Rate    1    2
        1 26.7/s   -- -35%
        2 40.9/s  53%   --

        As it happens, perl's multiply operator is optimised to handle int*int and float*float quickly; other permutations like int*float and float*string take slower paths. As it also happens, if you do $float * 4, the constant 4 is internally upgraded to hold both an IV and NV value, so subsequent iterations take the fast float*float code path.

        Dave.
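        For anyone who wants to look at the type upgrade directly, the core B module can report the internal class of a scalar. This is a minimal sketch; sv_class is a helper name invented here, and the exact classes printed may depend on the perl version in use.

```perl
use strict;
use warnings;
use B ();

# Report the internal SV class of a scalar, e.g. "B::IV" or "B::PVNV".
sub sv_class {
    my ($scalar_ref) = @_;
    return ref( B::svref_2object($scalar_ref) );
}

my $int_only   = 1;      # only ever held an integer
my $float_only = 1.1;    # only ever held a float
my $promoted   = 1;      # integer first ...
$promoted      = 1.1;    # ... then a float, as in the example above

printf "%-12s %s\n", 'int only',   sv_class(\$int_only);
printf "%-12s %s\n", 'float only', sv_class(\$float_only);
printf "%-12s %s\n", 'promoted',   sv_class(\$promoted);
```

        If the upgrade described above takes place, the promoted scalar should show up as B::PVNV while the float-only one stays a plain B::NV. Devel::Peek's Dump gives a more detailed view of the same thing.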