in reply to Re: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
in thread Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

Do you have the complete file that produced that handy (with all the implementations). And if so could you post it please?:)
  • Comment on Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

Replies are listed 'Best First'.
Re^3: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
by kyle (Abbot) on Sep 12, 2007 at 14:52 UTC

    Sure. This also incorporates the better testing from SuicideJunkie. With that change, it seems that moritz is back on top. The speed of a particular algorithm can be influenced pretty heavily by the input data. That is, it's influenced by what branches it has to take during execution...

      I've made another implementation that uses pos() instead of captures to get the string position, this makes a big difference on 5.8 but not as big a difference on 5.10 it seems. As SuicideJunkie points out the current benchmark doesn't really do the in-place implementations justice, here are the results of the updated benchmark:
      Rate split1 substr1 ikegami_s avar avar2 ikegami_ +tr avar2_pos corion moritz avar2_pos_inplace split1 5.05/s -- -82% -89% -99% -99% -10 +0% -100% -100% -100% -100% substr1 27.8/s 452% -- -38% -96% -96% -9 +7% -98% -99% -99% -99% ikegami_s 45.1/s 794% 62% -- -94% -94% -9 +6% -97% -98% -99% -99% avar 733/s 14427% 2534% 1525% -- -7% -2 +8% -50% -65% -82% -86% avar2 790/s 15549% 2737% 1651% 8% -- -2 +3% -46% -63% -80% -85% ikegami_tr 1022/s 20164% 3574% 2167% 39% 29% +-- -30% -52% -74% -80% avar2_pos 1463/s 28893% 5157% 3144% 100% 85% 4 +3% -- -31% -63% -71% corion 2123/s 41978% 7529% 4608% 190% 169% 10 +8% 45% -- -47% -58% moritz 3978/s 78730% 14193% 8720% 443% 404% 28 +9% 172% 87% -- -22% avar2_pos_inplace 5095/s 100879% 18208% 11198% 595% 545% 39 +8% 248% 140% 28% --
      Updated benchmark script:
      #!/usr/bin/perl use 5.6.0; use strict; use warnings FATAL => 'all'; use Benchmark qw( cmpthese ); my $s1 = do_rand(0, 100_000); my $s2 = do_rand(1, 100_000); my $subs = { 'split1' => sub { my $s3 = split1( $s1, $s2 ) }, 'substr1' => sub { my $s3 = substr1( $s1, $s2 ) }, # 'kyle' => sub { my $s3 = kyle( $s1, $s2 ) }, 'moritz' => sub { my $s3 = moritz( $s1, $s2 ) }, 'corion' => sub { my $s3 = corion( $s1, $s2 ) }, 'ikegami_s' => sub { my $s3 = ikegami_s( $s1, $s2 ) }, 'ikegami_tr' => sub { my $s3 = ikegami_tr( $s1, $s2 ) }, 'avar' => sub { my $s3 = avar( $s1, $s2 ) }, 'avar2' => sub { my $s3 = avar2( $s1, $s2 ) }, 'avar2_pos' => sub { my $s3 = avar2_pos( $s1, $s2 ) }, 'avar2_pos_inplace' => sub { avar2_pos_inplace( \$s1, $s2 ); $s1 +}, }; cmpthese( -2, $subs ); use Test::More; plan 'tests' => scalar keys %{$subs}; my $s3; foreach my $subname ( keys %{$subs} ) { my $sub = $subs->{$subname}; if ( defined $s3 ) { is( $sub->(), $s3, "$subname gets same value" ); } else { $s3 = $sub->(); ok( defined $s3, "$subname gets some value" ); } } sub split1 { my ($s1, $s2) = @_; my @s1 = split //, $s1; my @s2 = split //, $s2; foreach my $idx ( 0 .. $#s1 ) { if ( $s1[$idx] eq chr(0) ) { $s1[$idx] = $s2[$idx]; } } return join '', @s1; } sub substr1 { my ($s1, $s2) = @_; for my $idx ( 0 .. length($s1) ) { if ( substr($s1,$idx,1) eq chr(0) ) { substr($s1, $idx, 1) = substr($s2, $idx, 1); } } return $s1; } sub kyle { my ($s1, $s2) = @_; my $out = $s1; while ( $s1 =~ m/\000/g ) { my $pos = pos; substr( $out, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $out; } sub moritz { my ($s1, $s2) = @_; my $pos = 0; while ( 0 < ( $pos = index $s1, "\000", $pos ) ) { substr( $s1, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $s1; } sub ikegami_s { my ($s1, $s2) = @_; (my $mask = $s1) =~ s/[^\x00]/\xFF/g; return ($s1 & $mask) | ($s2 & ~$mask); } sub ikegami_tr { my ($s1, $s2) = @_; (my $mask = $s1) =~ tr/\x00/\xFF/c; return ($s1 & $mask) | ($s2 & ~$mask); } sub corion { my ($s1, $s2) = @_; my $ofs = 0; return join "", map { $ofs += length; $_ => substr $s2, $ofs++, 1 } + split /\0/, $s1, -1; } sub avar { my ($s1, $s2) = @_; my $s3 = $s1; { use bytes; $s3 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg; } $s3; } sub avar2 { my ($s1, $s2) = @_; use bytes; $s1 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg; return $s1; } sub avar2_pos { my ($s1, $s2) = @_; use bytes; $s1 =~ s/\0/substr $s2, pos($s1), 1/eg; return $s1; } sub avar2_pos_inplace { my ($s1, $s2) = @_; use bytes; $$s1 =~ s/\0/substr $s2, pos($$s1), 1/eg; } # This makes sure that $s1 has chr(0)'s in it and $s2 does not. sub do_rand { my $min = shift; my $len = shift; my $n = ""; for (1 .. $len) { $n .= chr( rand(255-$min)+$min ); } return $n; } #sub do_rand { # my $n = (shift) ? int(rand(255)) : int(rand(254)) + 1; # return chr( $n ); #} __END__

      It occurs to me that the in-place algorithms are not getting credited for their in-placeness, since they are all returning values and thus waste time making copies of the data.

      Some of these can probably be sped up further just by not returning anything since they change $s1 directly as the OP suggested.