in reply to Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

I've been testing the solutions posted so far along with my own. I post it here just for completeness since others are faster:

sub kyle { my ($s1, $s2) = @_; my $out = $s1; while ( $s1 =~ m/\000/g ) { my $pos = pos; substr( $out, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $out; }

I changed the test script slightly so that it checks that the subs in question actually work (they do):

my $subs = { 'split1' => sub { my $s3 = split1( $s1, $s2 ) }, 'substr1' => sub { my $s3 = substr1( $s1, $s2 ) }, 'kyle' => sub { my $s3 = kyle( $s1, $s2 ) }, 'moritz' => sub { my $s3 = moritz( $s1, $s2 ) }, 'corion' => sub { my $s3 = corion( $s1, $s2 ) }, 'ikegami_s' => sub { my $s3 = ikegami_s( $s1, $s2 ) }, 'ikegami_tr' => sub { my $s3 = ikegami_tr( $s1, $s2 ) }, 'avar' => sub { my $s3 = avar( $s1, $s2 ) }, }; cmpthese( -2, $subs ); use Test::More; plan 'tests' => scalar keys %{$subs}; my $s3; foreach my $subname ( keys %{$subs} ) { my $sub = $subs->{$subname}; if ( defined $s3 ) { is( $sub->(), $s3, "$subname gets same value" ); } else { $s3 = $sub->(); ok( defined $s3, "$subname gets some value" ); } }

Since moritz didn't actually post code, I wrote something based on the suggestion:

sub moritz { my ($s1, $s2) = @_; my $pos = 0; while ( 0 < ( $pos = index $s1, "\000", $pos ) ) { substr( $s1, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $s1; }

This was the fastest on my machine until avar posted a second version. (cmpthese lines truncated because they're so ugly.)

split1 5.94/s substr1 41.3/s ikegami_s 62.0/s ikegami_tr 2111/s corion 4342/s kyle 5608/s avar 5635/s moritz 6305/s avar2 6334/s 1..9 ok 1 - corion gets some value ok 2 - substr1 gets same value ok 3 - kyle gets same value ok 4 - split1 gets same value ok 5 - moritz gets same value ok 6 - ikegami_tr gets same value ok 7 - ikegami_s gets same value ok 8 - avar gets same value ok 9 - avar2 gets same value

Replies are listed 'Best First'.
Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
by avar (Beadle) on Sep 12, 2007 at 14:47 UTC
    Do you have the complete file that produced that handy (with all the implementations). And if so could you post it please?:)

      Sure. This also incorporates the better testing from SuicideJunkie. With that change, it seems that moritz is back on top. The speed of a particular algorithm can be influenced pretty heavily by the input data. That is, it's influenced by what branches it has to take during execution...

        I've made another implementation that uses pos() instead of captures to get the string position, this makes a big difference on 5.8 but not as big a difference on 5.10 it seems. As SuicideJunkie points out the current benchmark doesn't really do the in-place implementations justice, here are the results of the updated benchmark:
        Rate split1 substr1 ikegami_s avar avar2 ikegami_ +tr avar2_pos corion moritz avar2_pos_inplace split1 5.05/s -- -82% -89% -99% -99% -10 +0% -100% -100% -100% -100% substr1 27.8/s 452% -- -38% -96% -96% -9 +7% -98% -99% -99% -99% ikegami_s 45.1/s 794% 62% -- -94% -94% -9 +6% -97% -98% -99% -99% avar 733/s 14427% 2534% 1525% -- -7% -2 +8% -50% -65% -82% -86% avar2 790/s 15549% 2737% 1651% 8% -- -2 +3% -46% -63% -80% -85% ikegami_tr 1022/s 20164% 3574% 2167% 39% 29% +-- -30% -52% -74% -80% avar2_pos 1463/s 28893% 5157% 3144% 100% 85% 4 +3% -- -31% -63% -71% corion 2123/s 41978% 7529% 4608% 190% 169% 10 +8% 45% -- -47% -58% moritz 3978/s 78730% 14193% 8720% 443% 404% 28 +9% 172% 87% -- -22% avar2_pos_inplace 5095/s 100879% 18208% 11198% 595% 545% 39 +8% 248% 140% 28% --
        Updated benchmark script:
        #!/usr/bin/perl use 5.6.0; use strict; use warnings FATAL => 'all'; use Benchmark qw( cmpthese ); my $s1 = do_rand(0, 100_000); my $s2 = do_rand(1, 100_000); my $subs = { 'split1' => sub { my $s3 = split1( $s1, $s2 ) }, 'substr1' => sub { my $s3 = substr1( $s1, $s2 ) }, # 'kyle' => sub { my $s3 = kyle( $s1, $s2 ) }, 'moritz' => sub { my $s3 = moritz( $s1, $s2 ) }, 'corion' => sub { my $s3 = corion( $s1, $s2 ) }, 'ikegami_s' => sub { my $s3 = ikegami_s( $s1, $s2 ) }, 'ikegami_tr' => sub { my $s3 = ikegami_tr( $s1, $s2 ) }, 'avar' => sub { my $s3 = avar( $s1, $s2 ) }, 'avar2' => sub { my $s3 = avar2( $s1, $s2 ) }, 'avar2_pos' => sub { my $s3 = avar2_pos( $s1, $s2 ) }, 'avar2_pos_inplace' => sub { avar2_pos_inplace( \$s1, $s2 ); $s1 +}, }; cmpthese( -2, $subs ); use Test::More; plan 'tests' => scalar keys %{$subs}; my $s3; foreach my $subname ( keys %{$subs} ) { my $sub = $subs->{$subname}; if ( defined $s3 ) { is( $sub->(), $s3, "$subname gets same value" ); } else { $s3 = $sub->(); ok( defined $s3, "$subname gets some value" ); } } sub split1 { my ($s1, $s2) = @_; my @s1 = split //, $s1; my @s2 = split //, $s2; foreach my $idx ( 0 .. $#s1 ) { if ( $s1[$idx] eq chr(0) ) { $s1[$idx] = $s2[$idx]; } } return join '', @s1; } sub substr1 { my ($s1, $s2) = @_; for my $idx ( 0 .. length($s1) ) { if ( substr($s1,$idx,1) eq chr(0) ) { substr($s1, $idx, 1) = substr($s2, $idx, 1); } } return $s1; } sub kyle { my ($s1, $s2) = @_; my $out = $s1; while ( $s1 =~ m/\000/g ) { my $pos = pos; substr( $out, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $out; } sub moritz { my ($s1, $s2) = @_; my $pos = 0; while ( 0 < ( $pos = index $s1, "\000", $pos ) ) { substr( $s1, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $s1; } sub ikegami_s { my ($s1, $s2) = @_; (my $mask = $s1) =~ s/[^\x00]/\xFF/g; return ($s1 & $mask) | ($s2 & ~$mask); } sub ikegami_tr { my ($s1, $s2) = @_; (my $mask = $s1) =~ tr/\x00/\xFF/c; return ($s1 & $mask) | ($s2 & ~$mask); } sub corion { my ($s1, $s2) = @_; my $ofs = 0; return join "", map { $ofs += length; $_ => substr $s2, $ofs++, 1 } + split /\0/, $s1, -1; } sub avar { my ($s1, $s2) = @_; my $s3 = $s1; { use bytes; $s3 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg; } $s3; } sub avar2 { my ($s1, $s2) = @_; use bytes; $s1 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg; return $s1; } sub avar2_pos { my ($s1, $s2) = @_; use bytes; $s1 =~ s/\0/substr $s2, pos($s1), 1/eg; return $s1; } sub avar2_pos_inplace { my ($s1, $s2) = @_; use bytes; $$s1 =~ s/\0/substr $s2, pos($$s1), 1/eg; } # This makes sure that $s1 has chr(0)'s in it and $s2 does not. sub do_rand { my $min = shift; my $len = shift; my $n = ""; for (1 .. $len) { $n .= chr( rand(255-$min)+$min ); } return $n; } #sub do_rand { # my $n = (shift) ? int(rand(255)) : int(rand(254)) + 1; # return chr( $n ); #} __END__

        It occurs to me that the in-place algorithms are not getting credited for their in-placeness, since they are all returning values and thus waste time making copies of the data.

        Some of these can probably be sped up further just by not returning anything since they change $s1 directly as the OP suggested.