Re^2: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls)

Re^3: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls) by swl (Prior) on Nov 29, 2019 at 07:30 UTC
I know there are overheads to calling subs, but I did not think they were large enough that inline code runs millions of times per second while code refs run fewer than ten times per second. Purely out of curiosity, are the process and its overheads documented anywhere? I can think of things like setting up `@_` and other bookkeeping, but perhaps there are other steps.
Re^4: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls) by tobyink (Canon) on Nov 29, 2019 at 08:19 UTC
When you call a sub, Perl needs to store the current `@_` on the stack, along with the data for `caller` in case that gets called. It also needs to track the call context (scalar, list, or void), plus the address to return to once the sub call has completed. Some of this can be avoided by using `goto`. Method calls are even more work, because Perl has to look up at run time which sub to call, including traversing `@ISA`.
I'm pretty sure a lot of this is documented in Perl's XS documentation, because calling a sub from XS is pretty manual and you have to do a lot of this yourself in C code. (Though there are macros to simplify it.)
Sub calls are one of the most time-expensive built-in operations in Perl that don't involve the filesystem or network. In Type::Tiny, I go to ridiculous lengths to avoid sub calls. If you use a type constraint like `ArrayRef[Int]`, you might think there would be one sub to check that something is an arrayref, which would then call the sub that checks ints once for each element of the arrayref. But...

    use Types::Standard qw( Int ArrayRef );
    my $type = ArrayRef[ Int->where('$_ >= 0') ];
    my $check = $type->compiled_check;
    # The following check is ONE sub call. Just one.
    if ($check->(\@somearray)) {
        ...;
    }

You might be interested in Sub::Block, which automates some inlining stuff, especially with `grep` and `map`.
toby döt ink
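To get a feel for the per-call cost described above, here is a minimal sketch that times the same arithmetic inlined into a `map` block and wrapped in a sub that is called once per element. The names `times_four`, `$inline` and `$called`, the array size and the iteration count are all made up for illustration, and the ratio you see will depend on your Perl build and hardware.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Hypothetical workload: quadruple every element of a small array.
my @data = map { $_ / 7 } 1 .. 1_000;

# The same arithmetic wrapped in a named sub (illustrative name).
sub times_four { return $_[0] * 4 }

# Variant 1: arithmetic inlined into the map block, no sub call per element.
my $inline = sub { my @out = map { $_ * 4 } @data; return \@out };

# Variant 2: one sub call per element, paying the call overhead each time.
my $called = sub { my @out = map { times_four($_) } @data; return \@out };

# Both variants must compute the same thing, or the comparison is meaningless.
my ( $r1, $r2 ) = ( $inline->(), $called->() );
die "variants disagree\n" unless "@$r1" eq "@$r2";

# Run each variant a fixed number of times and print a comparison table.
cmpthese( 2_000, { inline => $inline, sub_call => $called } );
```

Both variants are passed as code refs, so Benchmark's own call overhead is the same on each side and the difference comes from the extra call per element.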
Re^5: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls) by swl (Prior) on Nov 29, 2019 at 22:53 UTC
Thanks for the details. I did get some improvements a while ago with some Inline::C code by pushing the loops into the C part instead of just the loop bodies, but don't recall seeing differences like those reported higher in this thread. I'll have a peruse of the XS docs and Sub::Block.
Re^3: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls) by vr (Curate) on Nov 29, 2019 at 08:57 UTC
There is no `@main::a`. Benchmark compiles string snippets outside the scope of your `my @a`, so `@a` in the string means the (empty) package array, and you are benchmarking a couple of no-ops.

    use strict;
    use warnings;
    use Benchmark qw/ cmpthese timeit /;

    our @b = my @a = map log, 1..1e6;

    timeit( 1, 'print "\$#a = $#a\n";' );
    timeit( 1, 'print "\$#b = $#b\n";' );

    cmpthese( -2, {
        a => '[ map $_ * 4, @a ]',
        b => '[ map $_ * 4, @b ]',
    });

    cmpthese( -2, {
        1 => '[ map $_ * 4, @b ]',
        2 => '[ map $_ / (1/4), @b ]',
    });

    __END__
    $#a = -1
    $#b = 999999
            Rate          b     a
    b     7.70/s         -- -100%
    a 10680591/s 138780824%    --
        Rate    1    2
    1 7.76/s   -- -28%
    2 10.7/s  39%   --
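A quick way to catch this pitfall before timing anything is to have each snippet report how much data it actually sees. The sketch below assumes nothing beyond the behaviour shown in the output above (string snippets resolve unqualified names to package variables); the names `@lex_data`, `@pkg_data` and the `$seen_*` scalars are invented for the example.

```perl
use strict;
use warnings;
use Benchmark qw( timeit );

# The same data under two names: a lexical array and a package array.
my  @lex_data = ( 1 .. 10 );
our @pkg_data = ( 1 .. 10 );

# Package scalars the snippets write into, so we can inspect them afterwards.
our ( $seen_lexical, $seen_package, $seen_closure );

# String snippets are compiled by Benchmark, outside this file's lexical
# scope, so an unqualified @lex_data there means @main::lex_data.
timeit( 1, '$main::seen_lexical = scalar @lex_data' );
timeit( 1, '$main::seen_package = scalar @pkg_data' );

# A code ref is a closure, so it does capture the lexical array.
timeit( 1, sub { $seen_closure = scalar @lex_data } );

print "string snippet, my  array: $seen_lexical elements\n";
print "string snippet, our array: $seen_package elements\n";
print "code ref,       my  array: $seen_closure elements\n";
```

If a snippet reports zero elements, whatever rate it achieves afterwards is the rate of a no-op.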
Re^4: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls) by dave_the_m (Monsignor) on Nov 29, 2019 at 14:35 UTC
The issue only occurs when a specific multiply op (i.e. a particular '*' on a particular line) is called multiple times, and on one occasion returns an integer result, and on subsequent other occasions returns a float. When this occurs the PADTMP (the private variable that the op uses to return its result) gets upgraded from an SVt_IV to an SVt_PVNV, which is the smallest type that can hold both an integer and a float. As it happens it can also potentially hold a string, and because of this, it is more expensive to free. So the extra time you're seeing in the benchmarks is just due to freeing the temporary array's now-more-complex elements. You can see a similar effect here, which involves no arithmetic:

    use Benchmark 'cmpthese';

    my $x = 1;
    $x = 1.1; # $x promoted to PVNV
    our @a = map $x, 1..1e6;

    my $y = 1.1;
    our @b = map $y, 1..1e6;

    cmpthese( -2, {
        1 => '{my @c = @a}',
        2 => '{my @c = @b}',
    });

    __END__
        Rate    1    2
    1 26.7/s   -- -35%
    2 40.9/s  53%   --

As it happens, perl's multiply operator is optimised to handle int*int and float*float quickly; other permutations like int*float and float*string take slower paths. As it also happens, if you do `$float * 4`, the constant 4 is internally upgraded to hold both an IV and an NV value, so subsequent iterations take the fast float*float code path.
Dave.
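The promotion described above can be observed directly without Devel::Peek. This is a small sketch using the core `B` module to report a scalar's internal body type; the helper name `body_type` and the variable names are illustrative only, and the exact types reported are an internal detail that could differ between perl versions.

```perl
use strict;
use warnings;
use B qw( svref_2object );

# Report the internal body type of a scalar as a B:: class name.
# Takes a reference to the scalar so that no copy is made.
sub body_type { return ref svref_2object( $_[0] ) }

my $promoted = 1;      # starts life holding only an integer
$promoted    = 1.1;    # now has to hold a float as well

my $plain = 1.1;       # has only ever held a float

my $type_promoted = body_type( \$promoted );
my $type_plain    = body_type( \$plain );

print "assigned int, then float: $type_promoted\n";
print "assigned float only:      $type_plain\n";
```

A scalar that has held both kinds of value should report the larger PVNV body, which is the kind that costs more to free in the benchmark above.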
Re^5: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls) by vr (Curate) on Nov 29, 2019 at 16:24 UTC
Thank you for looking into this, dave_the_m. But there's still a performance issue if an integer, or a float that appears to be an integer, is not the first but the last element of the array. In that case there's just one expensive-to-free element in the derived array?

    use strict;
    use warnings;
    use Devel::Peek;
    use Benchmark qw/ cmpthese timeit /;

    our @a = map log, 2..1e6;
    push @a, 0/1;

    print "********** a *********\n";
    Dump $a[-2];
    Dump $a[-1];

    my @b = map $_ * 4, @a;

    print "********** b *********\n";
    Dump $b[-2];
    Dump $b[-1];

    cmpthese( -2, {
        1 => 'my @c = map $_ * 4, @a',
        2 => 'my @c = map $_ / (1/4), @a',
    });

    __END__
    ********** a *********
    SV = NV(0x55f0580) at 0x55f0598
      REFCNT = 1
      FLAGS = (NOK,pNOK)
      NV = 13.8155105579643
    SV = NV(0x102a388) at 0x102a3a0
      REFCNT = 1
      FLAGS = (NOK,pNOK)
      NV = 0
    ********** b *********
    SV = NV(0xae879c8) at 0xae879e0
      REFCNT = 1
      FLAGS = (NOK,pNOK)
      NV = 55.2620422318571
    SV = PVNV(0x714ec8) at 0xae879f8
      REFCNT = 1
      FLAGS = (IOK,pIOK)
      IV = 0
      NV = 0
      PV = 0
        Rate    1    2
    1 10.3/s   -- -20%
    2 12.8/s  25%   --
Re^6: Should multiplication by integer be avoided in favour of division for performance reasons? (benchmark pitfalls) by dave_the_m (Monsignor) on Nov 29, 2019 at 19:32 UTC