Re: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

I've been testing the solutions posted so far along with my own. I post it here just for completeness since others are faster:

sub kyle {
    my ($s1, $s2) = @_;

    my $out = $s1;
    while ( $s1 =~ m/\000/g ) {
        my $pos = pos;
        substr( $out, $pos, 1 ) = substr( $s2, $pos, 1 );
    }

    return $out;
}
[download]

I changed the test script slightly so that it checks that the subs in question actually work (they do):

my $subs =  {
    'split1'     => sub { my $s3 = split1( $s1, $s2 ) },
    'substr1'    => sub { my $s3 = substr1( $s1, $s2 ) },
    'kyle'       => sub { my $s3 = kyle( $s1, $s2 ) },
    'moritz'     => sub { my $s3 = moritz( $s1, $s2 ) },
    'corion'     => sub { my $s3 = corion( $s1, $s2 ) },
    'ikegami_s'  => sub { my $s3 = ikegami_s( $s1, $s2 ) },
    'ikegami_tr' => sub { my $s3 = ikegami_tr( $s1, $s2 ) },
    'avar'       => sub { my $s3 = avar( $s1, $s2 ) },
};

cmpthese( -2, $subs );

use Test::More;

plan 'tests' => scalar keys %{$subs};

my $s3;
foreach my $subname ( keys %{$subs} ) {
    my $sub = $subs->{$subname};
    if ( defined $s3 ) {
        is( $sub->(), $s3, "$subname gets same value" );
    }
    else {
        $s3 = $sub->();
        ok( defined $s3, "$subname gets some value" );
    }
}
[download]

Since moritz didn't actually post code, I wrote something based on the suggestion:

sub moritz {
    my ($s1, $s2) = @_;

    my $pos = 0;
    while ( 0 < ( $pos = index $s1, "\000", $pos ) ) {
        substr( $s1, $pos, 1 ) = substr( $s2, $pos, 1 );
    }

    return $s1;
}
[download]

This was the fastest on my machine until avar posted a second version. (cmpthese lines truncated because they're so ugly.)

split1     5.94/s
substr1    41.3/s
ikegami_s  62.0/s
ikegami_tr 2111/s
corion     4342/s
kyle       5608/s
avar       5635/s
moritz     6305/s
avar2      6334/s
1..9
ok 1 - corion gets some value
ok 2 - substr1 gets same value
ok 3 - kyle gets same value
ok 4 - split1 gets same value
ok 5 - moritz gets same value
ok 6 - ikegami_tr gets same value
ok 7 - ikegami_s gets same value
ok 8 - avar gets same value
ok 9 - avar2 gets same value
[download]

Comment on Re: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer) Select or Download Code

Replies are listed 'Best First'.
Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer) by avar (Beadle) on Sep 12, 2007 at 14:47 UTC
Do you have the complete file that produced that handy (with all the implementations). And if so could you post it please?:)	[reply]
Re^3: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer) by kyle (Abbot) on Sep 12, 2007 at 14:52 UTC
Sure. This also incorporates the better testing from SuicideJunkie. With that change, it seems that moritz is back on top. The speed of a particular algorithm can be influenced pretty heavily by the input data. That is, it's influenced by what branches it has to take during execution... Read more... (3 kB)	[reply] [d/l]
Re^4: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer) by avar (Beadle) on Sep 12, 2007 at 15:06 UTC
I've made another implementation that uses pos() instead of captures to get the string position, this makes a big difference on 5.8 but not as big a difference on 5.10 it seems. As SuicideJunkie points out the current benchmark doesn't really do the in-place implementations justice, here are the results of the updated benchmark: Rate split1 substr1 ikegami_s avar avar2 ikegami_ +tr avar2_pos corion moritz avar2_pos_inplace split1 5.05/s -- -82% -89% -99% -99% -10 +0% -100% -100% -100% -100% substr1 27.8/s 452% -- -38% -96% -96% -9 +7% -98% -99% -99% -99% ikegami_s 45.1/s 794% 62% -- -94% -94% -9 +6% -97% -98% -99% -99% avar 733/s 14427% 2534% 1525% -- -7% -2 +8% -50% -65% -82% -86% avar2 790/s 15549% 2737% 1651% 8% -- -2 +3% -46% -63% -80% -85% ikegami_tr 1022/s 20164% 3574% 2167% 39% 29% +-- -30% -52% -74% -80% avar2_pos 1463/s 28893% 5157% 3144% 100% 85% 4 +3% -- -31% -63% -71% corion 2123/s 41978% 7529% 4608% 190% 169% 10 +8% 45% -- -47% -58% moritz 3978/s 78730% 14193% 8720% 443% 404% 28 +9% 172% 87% -- -22% avar2_pos_inplace 5095/s 100879% 18208% 11198% 595% 545% 39 +8% 248% 140% 28% -- [download] Updated benchmark script: #!/usr/bin/perl use 5.6.0; use strict; use warnings FATAL => 'all'; use Benchmark qw( cmpthese ); my $s1 = do_rand(0, 100_000); my $s2 = do_rand(1, 100_000); my $subs = { 'split1' => sub { my $s3 = split1( $s1, $s2 ) }, 'substr1' => sub { my $s3 = substr1( $s1, $s2 ) }, # 'kyle' => sub { my $s3 = kyle( $s1, $s2 ) }, 'moritz' => sub { my $s3 = moritz( $s1, $s2 ) }, 'corion' => sub { my $s3 = corion( $s1, $s2 ) }, 'ikegami_s' => sub { my $s3 = ikegami_s( $s1, $s2 ) }, 'ikegami_tr' => sub { my $s3 = ikegami_tr( $s1, $s2 ) }, 'avar' => sub { my $s3 = avar( $s1, $s2 ) }, 'avar2' => sub { my $s3 = avar2( $s1, $s2 ) }, 'avar2_pos' => sub { my $s3 = avar2_pos( $s1, $s2 ) }, 'avar2_pos_inplace' => sub { avar2_pos_inplace( \$s1, $s2 ); $s1 +}, }; cmpthese( -2, $subs ); use Test::More; plan 'tests' => scalar keys %{$subs}; my $s3; foreach my $subname ( keys %{$subs} ) { my $sub = $subs->{$subname}; if ( defined $s3 ) { is( $sub->(), $s3, "$subname gets same value" ); } else { $s3 = $sub->(); ok( defined $s3, "$subname gets some value" ); } } sub split1 { my ($s1, $s2) = @_; my @s1 = split //, $s1; my @s2 = split //, $s2; foreach my $idx ( 0 .. $#s1 ) { if ( $s1[$idx] eq chr(0) ) { $s1[$idx] = $s2[$idx]; } } return join '', @s1; } sub substr1 { my ($s1, $s2) = @_; for my $idx ( 0 .. length($s1) ) { if ( substr($s1,$idx,1) eq chr(0) ) { substr($s1, $idx, 1) = substr($s2, $idx, 1); } } return $s1; } sub kyle { my ($s1, $s2) = @_; my $out = $s1; while ( $s1 =~ m/\000/g ) { my $pos = pos; substr( $out, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $out; } sub moritz { my ($s1, $s2) = @_; my $pos = 0; while ( 0 < ( $pos = index $s1, "\000", $pos ) ) { substr( $s1, $pos, 1 ) = substr( $s2, $pos, 1 ); } return $s1; } sub ikegami_s { my ($s1, $s2) = @_; (my $mask = $s1) =~ s/[^\x00]/\xFF/g; return ($s1 & $mask) \| ($s2 & ~$mask); } sub ikegami_tr { my ($s1, $s2) = @_; (my $mask = $s1) =~ tr/\x00/\xFF/c; return ($s1 & $mask) \| ($s2 & ~$mask); } sub corion { my ($s1, $s2) = @_; my $ofs = 0; return join "", map { $ofs += length; $_ => substr $s2, $ofs++, 1 } + split /\0/, $s1, -1; } sub avar { my ($s1, $s2) = @_; my $s3 = $s1; { use bytes; $s3 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg; } $s3; } sub avar2 { my ($s1, $s2) = @_; use bytes; $s1 =~ s/(\0)/substr $s2, $+[0]-1, 1/eg; return $s1; } sub avar2_pos { my ($s1, $s2) = @_; use bytes; $s1 =~ s/\0/substr $s2, pos($s1), 1/eg; return $s1; } sub avar2_pos_inplace { my ($s1, $s2) = @_; use bytes; $$s1 =~ s/\0/substr $s2, pos($$s1), 1/eg; } # This makes sure that $s1 has chr(0)'s in it and $s2 does not. sub do_rand { my $min = shift; my $len = shift; my $n = ""; for (1 .. $len) { $n .= chr( rand(255-$min)+$min ); } return $n; } #sub do_rand { # my $n = (shift) ? int(rand(255)) : int(rand(254)) + 1; # return chr( $n ); #} __END__ [download]	[reply] [d/l] [select]
Re^4: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer) by SuicideJunkie (Vicar) on Sep 12, 2007 at 15:03 UTC
It occurs to me that the in-place algorithms are not getting credited for their in-placeness, since they are all returning values and thus waste time making copies of the data. Some of these can probably be sped up further just by not returning anything since they change $s1 directly as the OP suggested.	[reply]