in reply to Re: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
in thread Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
Update: As the above node, the implementations are broken, but pass the tests where they should not have. The theme of the node, that different perl builds are doing drastically different things with the same code, stands.
I downloaded, compiled, and installed 5.9.5 on my Linux box. I also have a few more tweaks I've tried. Here's some result summaries (the Linux box with 5.9.5 -- the first test listed -- is 30 seconds. The rest are still 2):
I should note that moritz's solution is between 50% and 75% slower than the top pure-Perl solution in all of these tests, and the rest of the ones I've tested fall below that.
I should also note that my Linux 5.8.7 does nearly twice as many iterations per second of every solution (of those faster than about 200 iterations per second anyway) than my 5.9.5 does, so I'm curious as to whether that's a development version thing or if my new perl just isn't compiled with as much optimization as the one that came with the distro. Switching to -O4 from -O2 for optimization and replacing some older x86-family lib references in the makefiles and rebuilding doesn't help much. I'm guessing the devel branch just isn't tuned at the source level as much as the stable branch, which makes sense.
Here's my code for mrm_4 and mrm_5:
sub mrm_4 { # from [bart]'s vec() my ($s1, $s2) = @_; use bytes; my $pos = 0; while ( -1 < ( $pos = index $$s1, '\0', $pos ) ) { vec( $$s1, $pos, 8 ) ||= vec( $s2, $pos, 8 ); } } sub mrm_5 { # from moritz's, seeing if four-arg substr() is # faster or slower than lvalue substr() my ( $s1, $s2 ) = @_; use bytes; my $pos = 0; while ( -1 < ( $pos = index $$s1, '\0', $pos ) ) { substr( $$s1, $pos, 1, substr( $s2, $pos, 1 ) ); } }
|
|---|