in reply to Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

There is an important flaw in your test. The entire 100_000 character string is composed of the exact same characters. There ARE NO NULLS most of the time!

Try using this instead:

#!/usr/bin/perl
use 5.6.0;
use strict;
use warnings FATAL => 'all';

use Benchmark qw( cmpthese );

my $s1 = do_rand(0, 100_000);
my $s2 = do_rand(1, 100_000);

# Count the nulls in $s1 to confirm the test data actually contains some.
my $nulls = 0;
foreach my $idx ( 0 .. length($s1) - 1 ) {
    if ( substr($s1, $idx, 1) eq chr(0) ) {
        $nulls++;
    }
}
print "There are $nulls nulls in S1\n";
print "Sample data: [" . substr($s1, 2, 10) . "]\n";

cmpthese( -2, {
    'split1'  => sub { my $s3 = split1( $s1, $s2 ) },
    'substr1' => sub { my $s3 = substr1( $s1, $s2 ) },
});

# Split both strings into characters and patch the nulls element by element.
sub split1 {
    my ($s1, $s2) = @_;
    my @s1 = split //, $s1;
    my @s2 = split //, $s2;

    foreach my $idx ( 0 .. $#s1 ) {
        if ( $s1[$idx] eq chr(0) ) {
            $s1[$idx] = $s2[$idx];
        }
    }
    return join '', @s1;
}

# Patch the nulls in place with 4-argument-style lvalue substr.
sub substr1 {
    my ($s1, $s2) = @_;

    for my $idx ( 0 .. length($s1) - 1 ) {
        if ( substr($s1, $idx, 1) eq chr(0) ) {
            substr($s1, $idx, 1) = substr($s2, $idx, 1);
        }
    }
    return $s1;
}

# This makes sure that $s1 has chr(0)'s in it and $s2 does not.
sub do_rand {
    my $min = shift;
    my $len = shift;
    my $n   = "";
    for (1 .. $len) {
        $n .= chr( rand(255 - $min) + $min );
    }
    return $n;
}
__END__
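
As a side note, the null-counting loop in the script can probably be replaced by a single tr/// call, which counts matching characters when the replacement list is empty. This is only a minimal sketch; count_nulls is a hypothetical helper name, not something from the original challenge.

```perl
use strict;
use warnings;

# Hypothetical helper: count the NUL bytes in a string without an explicit loop.
# With an empty replacement list and no /d, tr/// only counts; it does not modify.
sub count_nulls {
    my ($str) = @_;
    return ( $str =~ tr/\0// );
}

my $same  = 'a' x 20;            # one repeated character, so no nulls at all
my $mixed = "ab\0cd\0\0ef";      # three nulls
printf "same: %d, mixed: %d\n", count_nulls($same), count_nulls($mixed);
# prints "same: 0, mixed: 3"
```

Running a check like this on the benchmark input before timing anything is a cheap way to see whether the data exercises the replacement path at all.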

Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
by moritz (Cardinal) on Sep 12, 2007 at 14:44 UTC
    There is an important flaw in your test. The entire 100_000 character string is composed of the exact same characters. There ARE NO NULLS most of the time!

    That's only a flaw if the data that the algorithm is meant to work with has a different composition than the test data.

    Since we don't know this, we can only assume that dragonchild provided us with test data that looks like the "real" data.

    Update: I finally understood what you are saying... the testing was flawed, really.

      If that were true, then this would be close to the ultimate function:

      sub supertest {
          my $s1 = shift;
          if ( substr($s1, 1, 1) ne chr(0) ) {
              return $s1;
          }
          else {
              return shift;
          }
      }

      Seems pretty silly to me for that to be the case ;)

      On the "other" test set, my approach performs much closer to my expectations, but here the number of elements returned seems to matter more than the number of string joins:

                                 Rate split1 substr1 map_split subst map_split_join using_str_bit_ops_and_tr
      split1                   1.74/s     --    -94%     -100% -100%          -100%                    -100%
      substr1                  30.5/s  1650%      --      -93%  -94%           -94%                     -96%
      map_split                 455/s 26002%   1392%        --  -13%           -15%                     -34%
      subst                     520/s 29765%   1607%       14%    --            -3%                     -25%
      map_split_join            534/s 30577%   1653%       18%    3%             --                     -23%
      using_str_bit_ops_and_tr  692/s 39604%   2169%       52%   33%            29%                       --

      So, you will need to always benchmark with real data! ;)
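
The table lists an entry named using_str_bit_ops_and_tr, but its code does not appear in this part of the thread. The following is only a guess at what such an approach might look like, assuming both inputs are plain byte strings of equal length. The name or_equals_bitwise is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: fill the NUL positions of $s1 from $s2 using
# string bitwise operators instead of a per-character loop.
# Assumes byte strings of equal length and no "use feature 'bitwise'"
# (under that feature, | and & are numeric and |. / &. would be needed).
sub or_equals_bitwise {
    my ($s1, $s2) = @_;

    # Build a mask with "\xFF" where $s1 has a NUL and "\0" everywhere else.
    # The last replacement character ("\0") is reused for the rest of the range.
    (my $mask = $s1) =~ tr/\0-\xFF/\xFF\0/;

    # Keep $s2 only at the masked positions, then merge it into $s1.
    return $s1 | ($s2 & $mask);
}

print or_equals_bitwise("a\0c\0", "wxyz"), "\n";   # prints "axcz"
```

Because the work happens in a handful of whole-string operations, the cost should depend far less on how many nulls the data contains than the loop-based versions do, which is one more reason to benchmark against realistic input.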

Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
by dragonchild (Archbishop) on Sep 12, 2007 at 15:15 UTC
    Whoops! :-) Thanks for the correction.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?