How about “apples and oranges” or “it’s a really worthless benchmark”?
#!/usr/bin/perl
use strict;
use warnings;

use Benchmark qw( cmpthese );

sub run_tests {
    my ( $len_remove, $len_keep, $num_repeat ) = @_;

    my $remove = 'N' x $len_remove;
    my $keep   = 'O' x $len_keep;

    my %test = (
        front     => "$remove$keep" x $num_repeat,
        tail      => "$keep$remove" x $num_repeat,
        both_ends => "$remove$keep" x $num_repeat . $remove,
        nothing   => "$keep$remove" x $num_repeat . $keep,
    );

    print "$len_remove chars to remove, $len_keep chars long kept sequences, $num_repeat repetitions.\n";

    for my $type ( keys %test ) {
        print "Measuring removing at $type.\n";
        cmpthese -2 => {
            one_sub => sub { for( 1 .. 1000 ) { s{^N*(.*?)N*$}{$1} for my $copy = $test{$type} } },
            two_sub => sub { for( 1 .. 1000 ) { s{^N*}{}, s{N*$}{} for my $copy = $test{$type} } },
        };
    }

    print "\n";
}

$|++;

run_tests  4,   4,   1;
run_tests 20,  20,   1;
run_tests 20,  20,  50;
run_tests  4,   4,  20;
run_tests  4,  12,  10;
run_tests  4, 100, 100;
This gives me:
4 chars to remove, 4 chars long kept sequences, 1 repetitions.
Measuring removing at front.
           Rate one_sub two_sub
one_sub   125/s      --    -53%
two_sub   269/s    115%      --
Measuring removing at tail.
           Rate one_sub two_sub
one_sub   126/s      --    -54%
two_sub   276/s    120%      --
Measuring removing at nothing.
           Rate one_sub two_sub
one_sub   102/s      --    -49%
two_sub   201/s     98%      --
Measuring removing at both_ends.
           Rate one_sub two_sub
one_sub   122/s      --    -54%
two_sub   266/s    118%      --

20 chars to remove, 20 chars long kept sequences, 1 repetitions.
Measuring removing at front.
           Rate one_sub two_sub
one_sub  85.8/s      --    -48%
two_sub   165/s     92%      --
Measuring removing at tail.
           Rate one_sub two_sub
one_sub  85.8/s      --    -48%
two_sub   165/s     93%      --
Measuring removing at nothing.
           Rate one_sub two_sub
one_sub  48.8/s      --    -40%
two_sub  80.8/s     65%      --
Measuring removing at both_ends.
           Rate one_sub two_sub
one_sub  85.0/s      --    -48%
two_sub   162/s     91%      --

20 chars to remove, 20 chars long kept sequences, 50 repetitions.
Measuring removing at front.
           Rate one_sub two_sub
one_sub  2.20/s      --    -29%
two_sub  3.10/s     41%      --
Measuring removing at tail.
           Rate one_sub two_sub
one_sub  2.16/s      --    -29%
two_sub  3.03/s     40%      --
Measuring removing at nothing.
           Rate one_sub two_sub
one_sub  2.15/s      --    -29%
two_sub  3.04/s     42%      --
Measuring removing at both_ends.
           Rate one_sub two_sub
one_sub  2.16/s      --    -27%
two_sub  2.99/s     38%      --

4 chars to remove, 4 chars long kept sequences, 20 repetitions.
Measuring removing at front.
           Rate one_sub two_sub
one_sub  23.3/s      --    -36%
two_sub  36.6/s     57%      --
Measuring removing at tail.
           Rate one_sub two_sub
one_sub  23.8/s      --    -35%
two_sub  36.7/s     54%      --
Measuring removing at nothing.
           Rate one_sub two_sub
one_sub  22.9/s      --    -35%
two_sub  35.1/s     53%      --
Measuring removing at both_ends.
           Rate one_sub two_sub
one_sub  23.3/s      --    -37%
two_sub  36.7/s     58%      --

4 chars to remove, 12 chars long kept sequences, 10 repetitions.
Measuring removing at front.
           Rate one_sub two_sub
one_sub  24.2/s      --    -35%
two_sub  37.3/s     54%      --
Measuring removing at tail.
           Rate one_sub two_sub
one_sub  23.8/s      --    -37%
two_sub  37.9/s     59%      --
Measuring removing at nothing.
           Rate one_sub two_sub
one_sub  22.7/s      --    -35%
two_sub  34.9/s     54%      --
Measuring removing at both_ends.
           Rate one_sub two_sub
one_sub  24.2/s      --    -36%
two_sub  37.9/s     57%      --

4 chars to remove, 100 chars long kept sequences, 100 repetitions.
Measuring removing at front.
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
         s/iter one_sub two_sub
one_sub    2.14      --    -30%
two_sub    1.50     43%      --
Measuring removing at tail.
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
         s/iter one_sub two_sub
one_sub    2.18      --    -31%
two_sub    1.50     45%      --
Measuring removing at nothing.
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
         s/iter one_sub two_sub
one_sub    2.20      --    -31%
two_sub    1.52     45%      --
Measuring removing at both_ends.
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
         s/iter one_sub two_sub
one_sub    2.19      --    -31%
two_sub    1.50     46%      --

As you can see, the two-substitution version is faster in every case measured here, by anywhere from 38% to 120%. If you don’t believe me, run both versions through use re 'debug'; and watch what the engine is doing.
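For anyone who wants to try that, here is a minimal sketch of such a debug run. The sample string in $dna is a made-up stand-in chosen to keep the trace short; it is not one of the benchmark inputs. On reasonably modern perls the pragma is lexically scoped, so only the regexes inside the bare block are traced, and the trace goes to STDERR.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: four Ns on each side of four kept characters.
my $dna = 'NNNNOOOONNNN';

my ( $one, $two );

{
    # Trace compilation and matching of the regexes in this block only.
    use re 'debug';

    # Single substitution: anchored N*, lazy capture, trailing N*.
    ( $one = $dna ) =~ s{^N*(.*?)N*$}{$1};

    # Two substitutions: strip the front, then strip the tail.
    ( $two = $dna ) =~ s{^N*}{};
    $two =~ s{N*$}{};
}

print "one_sub: $one\n";
print "two_sub: $two\n";
```

Redirecting STDERR to a file (for example 2> trace.txt, where the file name is arbitrary) makes the trace easier to read. In the single-substitution trace you would expect to see the lazy (.*?) grow one character at a time, with N*$ retried after each step; neither of the two simpler patterns needs that kind of retrying.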
Makeshifts last the longest.
In reply to Re^4: Removing Flanking "N"s in a DNA String
by Aristotle
in thread Removing Flanking "N"s in a DNA String
by monkfan