(Yes, I didn't decrement $m when I simplified for posting.)
Ok, you're both right. Both suggestions are faster than my dizzy wheel.
Bitwise-and for power-of-two sizes is clearly fastest for indexing. And
incrementing and testing for equal is clearly the most general and only
slightly more complicated than the "straightforward" (slow)
"@a == $sz and shift @a; push @a,$val;"
But..., there's still this puzzlement:
my $m=4; my $x=$m; $m--; my @a=(1..$m,0);
timethese(-10, { wheel => sub {$x = $a[$x]},
equaland => sub {++$x==$m and $x=0},
mod2and => sub {$x++; $x &= $m;}
} );
Benchmark: running equaland, mod2and, wheel, each for at least 10 CPU
+seconds...
equaland: 192120.20/s
mod2and: 182722.70/s
wheel: 170109.49/s
As expected. (Using, 4192 as array size yields only slightly slower results
and an SMP linux box yields identical proportions.)
Yet...
my $m=4; my $x=$m; $m--; my @n=(1..$m,0); my @a=();
timethese(-10, { wheelix => sub {$a[ $x = $n[$x] ] = $x },
equalandix => sub {$a[ ++$x==$m and $x=0 ] = $x },
mod2andix => sub {$a[ ++$x, $x &= $m];}
} );
equalandix: 119200.90/s
mod2andix: 142736.03/s
wheelix: 130290.06/s
...moves equaland from first to third(!?!).
It must be something like what fundflow suggests. In such a tight loop,
the code for preinc and bitand is small enough to stay in cache -- and the
code for aelem is only loaded once -- and somehow this cuts off right at
the slightly more complicated preinc, eq and and.
Or something...
p
|