I've also added a few more tests from tassilo, although some of them failed the tests and are commendted out for now. )
Another Edit: repellent wins!! unpack is the way to go...
_Another_ Another Edit: tye wins! Not building a list is even better... See tye's post. I have updated this post w/ the tye results.
_Another_^3 Edit: BrowserUK wins! Well, after the fix so it isn't so destructive of the testArray.
A variant of BrowserUK's approach that uses substr comes in second, but a long ways behind chop.
_Another_^4 Edit: BrowserUK wins again!
buk2 is by far the fastest, but fails a "0", "0" test, so buk2_len which is a bit slower is the real winner, still 10 times faster than the competition for some test sets.
For REALLY fast routines being benchmarked, copying args is clearly a very bad idea. However, other routines like tye0 to tye2 which were the fastest for a while don't benefit as much from removing the argument copies (see the _opt variants in the results).
_Another_^5 Edit: BrowserUK wins again (again)!
buk3 is by far the fastest and passes all the tests
In check if all certain chars are found in a sentence, tallulah asked for improvements on a routine trying to check whether a given sentence contains all of a set of specified characters.
Many monks suggested solutions, with way more than one way to do it.
Here are the results of some benchmarks on all of the suggested approaches on my Macbook Pro, using the provided 5.8.8 build.
I was quite surprised by the results, actually... I'd have bet good money that the slice approach was going to beat anything else, unless Tanktalus's List::MoreUtils approaches won. I promise, I was going to post this no wonder which implementation won :)
Where a solution was regex based, I also added a variant using "study", to see how that impacted performance.
If anyone wants to add a few test cases then run the tests, invoke the script with a -t argument.
Detailed results behind readmore, to fix Meditations page formatting:
Short
Rate tallulah_OriginalPost
tallulah_OriginalPost 105/s --
Tanktalus_AllRegex_Study 127/s 22%
Tanktalus_AllRegex 130/s 24%
varian_hash 146/s 40%
RMGir_slice 207/s 98%
moritz_BuildRegex_WithStudy 243/s 133%
moritz_BuildRegex 261/s 149%
ysth_loookahead 350/s 234%
Tanktalus_AllIndex 411/s 293%
unpack_allindex 444/s 324%
unpack_allrindex 448/s 328%
tassilo_listutils_r 581/s 455%
tye2 655/s 525%
tye1 661/s 531%
RMGir_index 667/s 537%
repellent_unpack 770/s 635%
repellent_unpack_opt 792/s 657%
tye2_opt 807/s 671%
tye1_opt 815/s 678%
tye0 829/s 692%
tye0_opt 830/s 693%
buk_substr 1152/s 1000%
buk 1829/s 1648%
buk2_len 4443/s 4144%
buk2 4610/s 4305%
buk3 5144/s 4814%
LongShort
Rate varian_hash
varian_hash 4.90/s --
RMGir_slice 5.88/s 20%
tallulah_OriginalPost 168/s 3319%
Tanktalus_AllRegex_Study 173/s 3422%
Tanktalus_AllRegex 209/s 4171%
moritz_BuildRegex_WithStudy 243/s 4867%
moritz_BuildRegex 396/s 7983%
ysth_loookahead 575/s 11640%
Tanktalus_AllIndex 731/s 14822%
unpack_allrindex 799/s 16201%
unpack_allindex 821/s 16650%
tassilo_listutils_r 914/s 18549%
RMGir_index 1017/s 20652%
tye2 1018/s 20671%
tye1 1046/s 21234%
repellent_unpack 1158/s 23518%
repellent_unpack_opt 1207/s 24515%
tye2_opt 1386/s 28180%
tye0_opt 1399/s 28442%
tye0 1414/s 28746%
tye1_opt 1440/s 29269%
buk_substr 1761/s 35815%
buk 2849/s 58012%
buk2_len 9489/s 193482%
buk2 9686/s 197486%
buk3 11005/s 224395%
ShortLong
Rate tallulah_OriginalPost
tallulah_OriginalPost 3.20/s --
Tanktalus_AllRegex 4.42/s 38%
Tanktalus_AllRegex_Study 4.42/s 38%
moritz_BuildRegex_WithStudy 5.36/s 67%
moritz_BuildRegex 5.41/s 69%
varian_hash 8.18/s 156%
ysth_loookahead 9.26/s 189%
RMGir_index 14.4/s 351%
tassilo_listutils_r 15.9/s 396%
RMGir_slice 15.9/s 396%
Tanktalus_AllIndex 16.0/s 401%
repellent_unpack_opt 16.8/s 425%
repellent_unpack 17.1/s 435%
unpack_allrindex 20.0/s 525%
unpack_allindex 20.2/s 531%
tye2_opt 25.5/s 696%
tye0_opt 25.7/s 703%
tye0 25.7/s 704%
tye1_opt 26.2/s 719%
tye2 26.9/s 739%
tye1 27.4/s 755%
buk_substr 64.0/s 1900%
buk 67.9/s 2021%
buk2_len 888/s 27649%
buk2 939/s 29247%
buk3 1109/s 34553%
LongLong
Rate tallulah_OriginalPost
tallulah_OriginalPost 3.20/s --
Tanktalus_AllRegex_Study 4.46/s 40%
Tanktalus_AllRegex 4.50/s 41%
varian_hash 4.59/s 43%
moritz_BuildRegex 5.41/s 69%
moritz_BuildRegex_WithStudy 5.41/s 69%
RMGir_slice 6.80/s 112%
ysth_loookahead 9.40/s 194%
RMGir_index 12.9/s 302%
repellent_unpack_opt 15.1/s 372%
repellent_unpack 15.6/s 387%
tassilo_listutils_r 16.0/s 399%
Tanktalus_AllIndex 16.0/s 401%
unpack_allindex 20.4/s 537%
unpack_allrindex 20.6/s 543%
tye1_opt 25.9/s 708%
tye0 26.0/s 711%
tye2_opt 26.0/s 711%
tye0_opt 26.2/s 719%
tye2 27.3/s 752%
tye1 27.3/s 752%
buk_substr 63.6/s 1889%
buk 67.3/s 2002%
buk2_len 852/s 26537%
buk2 896/s 27908%
buk3 1017/s 31690%
VeryLong
Rate varian_hash
varian_hash 8.41/s --
RMGir_slice 10.2/s 21%
tallulah_OriginalPost 189/s 2152%
Tanktalus_AllRegex_Study 205/s 2332%
Tanktalus_AllRegex 245/s 2810%
moritz_BuildRegex_WithStudy 346/s 4019%
moritz_BuildRegex 508/s 5942%
ysth_loookahead 800/s 9411%
Tanktalus_AllIndex 1090/s 12861%
unpack_allrindex 1184/s 13980%
tassilo_listutils_r 1207/s 14245%
unpack_allindex 1230/s 14521%
tye2_opt 1267/s 14963%
tye1_opt 1268/s 14974%
tye0 1279/s 15106%
tye2 1280/s 15118%
RMGir_index 1280/s 15118%
tye0_opt 1292/s 15264%
tye1 1297/s 15323%
repellent_unpack 1506/s 17801%
repellent_unpack_opt 1527/s 18058%
buk_substr 2634/s 31219%
buk 2830/s 33543%
buk2_len 24093/s 286343%
buk2 25048/s 297687%
buk3 28768/s 341917%
#!/usr/bin/perl -w use strict; use List::MoreUtils qw(all); use Benchmark qw(cmpthese); #use Test::More qw(no_plan); require Test::More; my @shortTestCases=( # sentence wantedChars result [ "abxcd zwe rrv", "0", 0 ], [ "0", "0", 1 ], [ "abxcd zwe rrv", "xxv", 1 ], [ "abxcd zwe rrv", "xxvq", 0 ], [ "abxcd zwe rrv", "", 1 ], [ "The quick brown fox jumps over the lazy dog", "abcdefghijklmnopqrstuvwxyz", 1 ], [ "The quick brown fox jumps over the lazy dog", "abcdefghijklmnopqrstuvwxyzT", 1 ], [ "The quick brown fox jumps over the lazy dog", "abcdefghijklmnopqrstuvwxyzTU", 0 ], [ "The quick brown fox jumps over the lazy dog", "a", 1 ], [ "The quick brown fox jumps over the lazy dog", "", 1 ], ); # Long sentence, short wantedChars my @longShortTestCases = ( [ "The quick brown fox jumps over the lazy dog" x 100, "", 1 ], [ "The quick brown fox jumps over the lazy dog" x 100, "a", 1 ], [ "The quick brown fox jumps over the lazy dog" x 100, "abcdefghijklmnopqrstuvwxyzT", 1 ], [ "The quick brown fox jumps over the lazy dog" x 100, "abcdefghijklmnopqrstuvwxyzTU", 0 ], ); # Short sentence, long wantedChars my @shortLongTestCases = ( [ "The quick brown fox jumps over the lazy dog", "abcdefghijklmnopqrstuvwxyzT"x100, 1 ], [ "The quick brown fox jumps over the lazy dog", "abcdefghijklmnopqrstuvwxyzTU"x100, 0 ], ); # Long sentence, long wantedChars my @longLongTestCases = ( [ "The quick brown fox jumps over the lazy dog" x 100, "abcdefghijklmnopqrstuvwxyzT"x100, 1 ], [ "The quick brown fox jumps over the lazy dog" x 100, "abcdefghijklmnopqrstuvwxyzTU"x100, 0 ], ); # VERY long test case my $alphabet=(join '' => 'a' .. 'z', 'A' .. 'Z'); my @veryLongTestCase = ( [ random(10000, $alphabet), $alphabet, 1 ], ); my @testCases = ( @shortTestCases, @shortLongTestCases, @longShortTestCases, @longLongTestCases, @veryLongTestCase ); sub random { my $num = shift; my $wantedChars = shift; my @chars = ('a' .. 'z', 'A' .. 'Z'); my $result; # make sure result will work do { $result = join '' => map $chars[rand @chars], 1 .. $num; } until Tanktalus_AllIndex($result, $wantedChars); return $result; } sub test_routine { my ($testFn, $testName)=@_; foreach(@testCases) { my ($sentence, $wantedLetters, $expectedResult) = @$_; ok (!!($testFn->($sentence, $wantedLetters)) == !!$expectedResu +lt); } } sub benchmark_routine { my ($testFn, $testName, $testCases)=@_; foreach(@$testCases) { my ($sentence, $wantedLetters, $expectedResult) = @$_; $testFn->($sentence, $wantedLetters) for 1..20; } } # [id://707122] sub tallulah_OriginalPost { my ($sentence, $wantedLetters)=@_; my $flag=0; my @a = split '',$wantedLetters; for( my $i=0; $i<$#a+1; $i++ ) { if($sentence !~ /$a[$i]/) { $flag=1;last; } } return !$flag; } # [id://707123] sub moritz_BuildRegex { my ($sentence, $wantedLetters)=@_; my $re = '^' . join '', map "(?=.*?$_)", map quotemeta, split m//, $wantedLetters; if ($sentence =~ m/$re/) { return 1; } return 0; } # [id://707123] sub moritz_BuildRegex_WithStudy { my ($sentence, $wantedLetters)=@_; my $re = '^' . join '', map "(?=.*?$_)", map quotemeta, split m//, $wantedLetters; study $sentence; if ($sentence =~ m/$re/) { return 1; } return 0; } # [id://707124] sub RMGir_index { my ($sentence, $wantedLetters)=@_; # don't need this variable (or any of them, in # fact -- they're just here for clarity. # we could work straight out of @_ if we wanted # this terser # Also, the $[ check is just pedantic - if someone # changes $[, shoot them. my $foundLetters=scalar (grep index($sentence,$_)>=$[, split //,$wantedLetters); return length($wantedLetters)==$foundLetters; } # [id://707222] sub Tanktalus_AllRegex { my ($sentence, $letters) = @_; return 1 unless length($letters); # all we're doing is checking for each letter. all { $sentence =~ $_ } split //, $letters; } # [id://707222] sub Tanktalus_AllRegex_Study { my ($sentence, $letters) = @_; return 1 unless length($letters); study $sentence; # all we're doing is checking for each letter. all { $sentence =~ $_ } split //, $letters; # same as above, but with index which I think is less readable. #all { index($sentence, $_) >= $[ } split //, $letters; } # [id://707222] sub Tanktalus_AllIndex { my ($sentence, $letters) = @_; return 1 unless length($letters); # same as above, but with index which I think is less readable. all { index($sentence, $_) >= $[ } split //, $letters; } # JavaFan's looks about equivalent to OP approach # Doesn't have same repeated letter semantics specified in # OP post. # [id://707176] sub oshalla_scan { my ($sentence, $wanted) = @_ ; while (length($wanted)) { return 0 if ($sentence !~ m/([$wanted])/g) ; $wanted =~ s/$1// ; } ; return 1; } # [id://707231] sub varian_hash { my ($sentence, $wantedLetters)=@_; my %required = map {$_ => 1} split //,$wantedLetters; map delete $required{$_}, split //, $sentence; if (keys %required) { return 0; } else { return 1; } } # [id://707314] sub RMGir_slice { my ($sentence, $wantedLetters)=@_; my %required; @required{split //,$wantedLetters}=(); delete @required{split //, $sentence}; if (keys %required) { return 0; } else { return 1; } } # [tassilo]'s test cases sub makeSubRef { return @_[1,0]; } my %testSubroutines=( makeSubRef(\&tallulah_OriginalPost, "tallulah_OriginalPost"), makeSubRef(\&moritz_BuildRegex, "moritz_BuildRegex"), makeSubRef(\&moritz_BuildRegex_WithStudy, "moritz_BuildRegex_WithStudy"), makeSubRef(\&RMGir_index, "RMGir_index"), makeSubRef(\&Tanktalus_AllRegex, "Tanktalus_AllRegex"), makeSubRef(\&Tanktalus_AllRegex_Study, "Tanktalus_AllRegex_Study"), makeSubRef(\&Tanktalus_AllIndex, "Tanktalus_AllIndex"), makeSubRef(\&varian_hash, "varian_hash"), makeSubRef(\&RMGir_slice, "RMGir_slice"), tassilo_listutils_r => sub { return 1 unless length($_[1]); all { rindex($_[0], $_) >= 0 } split //, $_[1]; }, repellent_unpack => sub { my ($sentence, $wantedLetters)=@_; my $foundLetters=scalar (grep index($sentence,$_)>=$[, unpack "(a)*", $wantedLetters); return length($wantedLetters)==$foundLetters; }, repellent_unpack_opt => sub { length($_[1])==scalar (grep index($_[0],$_)>=$[, unpack "(a)*", $_[1]); }, unpack_allindex => sub { return 1 unless length($_[1]); my ($sentence, $wantedLetters)=@_; all { index($sentence, $_) >= $[ } unpack "(a)*", $wantedLetters; }, unpack_allrindex => sub { return 1 unless length($_[1]); my ($sentence, $wantedLetters)=@_; all { rindex($sentence, $_) >= $[ } unpack "(a)*", $wantedLetters; }, tye2_opt => sub { while( $_[1] =~ /(.)/gs ) { return 0 if -1 == index($_[0],$1); } return 1; }, tye1_opt => sub { -1 == index($_[0],$1) && return 0 while( $_[1] =~ /(.)/gs ); return 1; }, tye0_opt => sub { -1 == index( $_[0], $1 ) && return 0 while( $_[1] =~ /(.)/gs ); return 1; }, tye2 => sub { my( $sentence, $wantedLetters )= @_; while( $wantedLetters =~ /(.)/gs ) { return 0 if -1 == index($sentence,$1); } return 1; }, tye1 => sub { my( $sentence, $wantedLetters )= @_; -1 == index($sentence,$1) && return 0 while( $wantedLetters =~ /(.)/gs ); return 1; }, tye0 => sub { -1 == index( $_[0], $1 ) && return 0 while( $_[1] =~ /(.)/gs ); return 1; }, # FAILS the "0" "0" test.... buk => sub { my( $s, $w ) = @_; my $c; 1+index $s, $c or return 0 while $c = chop $w; 1; }, buk_substr => sub { my( $s, $w ) = @_; 1+index $s, substr($w,$_,1) or return 0 foreach 0..length($w); 1; }, # FAILS the "0" "0" test.... buk2 => sub { local $_; 1+index $_[0], $_ or return while $_ = chop $_[1]; 1; }, buk2_len => sub { local $_; 1+index $_[0], $_ or return while length($_ = chop $_[1]); 1; }, buk3 => sub { 1+index $_[0], chop $_[ 1 ] or return for 1 .. length $_[ 1 ]; 1; }, buk4 => sub { ( -1 != index $_[0], chop $_[ 1 ] ) || return for 1 .. length +$_[ 1 ]; 1; }, ysth_loookahead => sub { my ($sentence, $wantedLetters) = @_; $wantedLetters =~ s/(.)/(?=.*?\Q$1\E)/sg; $sentence =~ /^$wantedLetters/s; } # # These 2 routines need more debugging - they fail the tests # tassilo_xor => sub { # return 1 unless length($_[1]); # my $copy = $_[0]; # for (split //, $_[1]) { # my $mask = ($_) x length $copy; # $copy ^= $mask; # $copy =~ tr/\000//d; # $copy ^= ( ($_) x length $copy ); # }; # length($copy) == 0; # }, # tassilo_tr => sub { # return 1 unless length($_[1]); # my $copy = $_[0]; # eval "\$copy =~ tr/$_[1]//d"; # length($copy) == 0; # } ); if(@ARGV && $ARGV[0] eq "-t") { use Test::More; plan tests => ((scalar keys %testSubroutines) * scalar @testCases) +; print "Testing routines...\n"; foreach my $name(sort keys %testSubroutines) { print "Testing $name\n"; test_routine($testSubroutines{$name}, $name); } exit(0); } print "Running benchmarks...\n"; my $testsRef = \@shortTestCases; my $benchmark_routines={ map { ($_, eval qq[sub { benchmark_routine(\$testSubroutines{"$_"}, "$_", \$testsRef); } ]) } keys %testSubroutines }; print "Short \n"; cmpthese(-1, $benchmark_routines); $testsRef = \@longShortTestCases; print "\n\n"; print "LongShort \n"; cmpthese(-1, $benchmark_routines); $testsRef = \@shortLongTestCases; print "\n\n"; print "ShortLong \n"; cmpthese(-1, $benchmark_routines); $testsRef = \@longLongTestCases; print "\n\n"; print "LongLong \n"; cmpthese(-1, $benchmark_routines); $testsRef = \@veryLongTestCase; print "\n\n"; print "VeryLong \n"; cmpthese(-1, $benchmark_routines);
|
|---|