How about “apples and oranges” or “it’s a really worthless benchmark”?

#!/usr/bin/perl
use strict;
use warnings;

use Benchmark qw( cmpthese );

sub run_tests {
    my ( $len_remove, $len_keep, $num_repeat ) = @_;

    my $remove = 'N' x $len_remove;
    my $keep = 'O' x $len_keep;

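    # Four input shapes, differing in where the removable run of N's sits.
    my %test = (
        front     => "$remove$keep" x $num_repeat,              # string starts with N's
        tail      => "$keep$remove" x $num_repeat,              # string ends with N's
        both_ends => "$remove$keep" x $num_repeat . $remove,    # N's at both ends
        nothing   => "$keep$remove" x $num_repeat . $keep,      # no flanking N's
    );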

    print "$len_remove chars to remove, $len_keep chars long kept sequences, $num_repeat repetitions.\n";
    for my $type ( keys %test ) {
        print "Measuring removing at $type.\n";
        cmpthese -2 => {
            one_sub => sub {
                for ( 1 .. 1000 ) {
                    s{^N*(.*?)N*$}{$1} for my $copy = $test{$type};
                }
            },
            two_sub => sub {
                for ( 1 .. 1000 ) {
                    s{^N*}{}, s{N*$}{} for my $copy = $test{$type};
                }
            },
        };
    }

    print "\n";
}

$|++;

run_tests 4, 4, 1;
run_tests 20, 20, 1;
run_tests 20, 20, 50;
run_tests 4, 4, 20;
run_tests 4, 12, 10;
run_tests 4, 100, 100;

This gives me:

4 chars to remove, 4 chars long kept sequences, 1 repetitions.
Measuring removing at front.
         Rate one_sub two_sub
one_sub 125/s      --    -53%
two_sub 269/s    115%      --
Measuring removing at tail.
         Rate one_sub two_sub
one_sub 126/s      --    -54%
two_sub 276/s    120%      --
Measuring removing at nothing.
         Rate one_sub two_sub
one_sub 102/s      --    -49%
two_sub 201/s     98%      --
Measuring removing at both_ends.
         Rate one_sub two_sub
one_sub 122/s      --    -54%
two_sub 266/s    118%      --

20 chars to remove, 20 chars long kept sequences, 1 repetitions.
Measuring removing at front.
          Rate one_sub two_sub
one_sub 85.8/s      --    -48%
two_sub  165/s     92%      --
Measuring removing at tail.
          Rate one_sub two_sub
one_sub 85.8/s      --    -48%
two_sub  165/s     93%      --
Measuring removing at nothing.
          Rate one_sub two_sub
one_sub 48.8/s      --    -40%
two_sub 80.8/s     65%      --
Measuring removing at both_ends.
          Rate one_sub two_sub
one_sub 85.0/s      --    -48%
two_sub  162/s     91%      --

20 chars to remove, 20 chars long kept sequences, 50 repetitions.
Measuring removing at front.
          Rate one_sub two_sub
one_sub 2.20/s      --    -29%
two_sub 3.10/s     41%      --
Measuring removing at tail.
          Rate one_sub two_sub
one_sub 2.16/s      --    -29%
two_sub 3.03/s     40%      --
Measuring removing at nothing.
          Rate one_sub two_sub
one_sub 2.15/s      --    -29%
two_sub 3.04/s     42%      --
Measuring removing at both_ends.
          Rate one_sub two_sub
one_sub 2.16/s      --    -27%
two_sub 2.99/s     38%      --

4 chars to remove, 4 chars long kept sequences, 20 repetitions.
Measuring removing at front.
          Rate one_sub two_sub
one_sub 23.3/s      --    -36%
two_sub 36.6/s     57%      --
Measuring removing at tail.
          Rate one_sub two_sub
one_sub 23.8/s      --    -35%
two_sub 36.7/s     54%      --
Measuring removing at nothing.
          Rate one_sub two_sub
one_sub 22.9/s      --    -35%
two_sub 35.1/s     53%      --
Measuring removing at both_ends.
          Rate one_sub two_sub
one_sub 23.3/s      --    -37%
two_sub 36.7/s     58%      --

4 chars to remove, 12 chars long kept sequences, 10 repetitions.
Measuring removing at front.
          Rate one_sub two_sub
one_sub 24.2/s      --    -35%
two_sub 37.3/s     54%      --
Measuring removing at tail.
          Rate one_sub two_sub
one_sub 23.8/s      --    -37%
two_sub 37.9/s     59%      --
Measuring removing at nothing.
          Rate one_sub two_sub
one_sub 22.7/s      --    -35%
two_sub 34.9/s     54%      --
Measuring removing at both_ends.
          Rate one_sub two_sub
one_sub 24.2/s      --    -36%
two_sub 37.9/s     57%      --

4 chars to remove, 100 chars long kept sequences, 100 repetitions.
Measuring removing at front.
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        s/iter one_sub two_sub
one_sub   2.14      --    -30%
two_sub   1.50     43%      --
Measuring removing at tail.
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        s/iter one_sub two_sub
one_sub   2.18      --    -31%
two_sub   1.50     45%      --
Measuring removing at nothing.
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        s/iter one_sub two_sub
one_sub   2.20      --    -31%
two_sub   1.52     45%      --
Measuring removing at both_ends.
            (warning: too few iterations for a reliable count)
            (warning: too few iterations for a reliable count)
        s/iter one_sub two_sub
one_sub   2.19      --    -31%
two_sub   1.50     46%      --

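As you can see, the two-subst version is faster in every case measured: from about 40% faster on the longest strings to roughly twice as fast on the shortest. If you don’t believe me, run the thing through use re 'debug'; and watch what the engine is doing.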
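
If you want to try that, here is a minimal sketch of one way to do it. The sample string and the sub names trim_one_sub and trim_two_sub are made up for illustration; the two patterns are the ones from the benchmark above. It assumes a perl recent enough (5.10 or later, as far as I know) that use re 'debug' is lexically scoped, so each trace stays confined to its own sub.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: the "both_ends" shape with 4 N's, 4 O's, 1 repetition.
my $sample = 'NNNN' . 'OOOO' . 'NNNN';

# Single substitution: the lazy (.*?) has to be extended one char at a
# time until the trailing N*$ can finally match.
sub trim_one_sub {
    my ($copy) = @_;
    use re 'debug';    # traces only the regex compiled in this scope
    $copy =~ s{^N*(.*?)N*$}{$1};
    return $copy;
}

# Two substitutions: each pattern handles one end of the string.
sub trim_two_sub {
    my ($copy) = @_;
    use re 'debug';
    $copy =~ s{^N*}{};
    $copy =~ s{N*$}{};
    return $copy;
}

# The debug trace goes to STDERR; the markers separate the run-time parts.
# (The compile-time dumps of all three patterns appear before either marker.)
print STDERR "=== one_sub ===\n";
my $one = trim_one_sub($sample);
print STDERR "=== two_sub ===\n";
my $two = trim_two_sub($sample);

# Both variants should leave just the kept sequence.
print "one_sub: $one\n";
print "two_sub: $two\n";
```

Run it with STDERR redirected to a file and compare the two run-time sections. The trace is verbose, so a short sample string like this one is easier to read than the benchmark inputs.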

Makeshifts last the longest.

In reply to Re^4: Removing Flanking "N"s in a DNA String by Aristotle
in thread Removing Flanking "N"s in a DNA String by monkfan
