Before you look for areas that you can optimize, find out which areas actually need optimizing. You can do this with the Benchmark module: split the function into working pieces, then time each piece separately.

This line strikes me as possibly inefficient: because $_ is interpolated into the pattern, a new regex has to be compiled on each iteration, which means 1000 regexes for a 1000-character sequence.

     ($left,$middle,$right) = $sequence =~ m/(\w{$_})?(\w{1})?(\w*)/;
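To make it concrete what that line computes, here is a minimal sketch for a single position. The sample sequence and the index are made up for illustration; they are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration
my $sequence = 'ABCDEFG';
my $i        = 3;

# $i is interpolated into the quantifier, so the pattern text is
# different for every value of $i
my ($left, $middle, $right) = $sequence =~ m/(\w{$i})?(\w{1})?(\w*)/;

# $left holds the first $i characters, $middle the character at
# offset $i, and $right everything after it
print "left=$left middle=$middle right=$right\n";
```

For this input the three pieces are ABC, D and EFG, which is exactly what three substr calls at offsets 0, $i and $i+1 would return.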

So I wrote a benchmark to try a few strategies:

use strict;
use warnings;
use Benchmark qw(cmpthese);
my $iterations    = shift @ARGV || 5000;
my $sequence_size = shift @ARGV || 10;

# Simplistic random sequence
my $sequence = join '', map {chr(65 + rand(26))} 1..$sequence_size;

cmpthese($iterations,{
    re_orig => \&re_orig,
    re_set  => \&re_set,
    bsubstr => \&bsubstr,
});
# Test using the original regex
sub re_orig {
    for my $i (0..length($sequence)-1) {
        my ($left,$middle,$right) = $sequence =~ m/(\w{$i})?(\w{1})?(\w*)/;
    }
}
# Test using the character set [A-Z]
sub re_set {
    for my $i (0..length($sequence)-1) {
        my ($left,$middle,$right) = $sequence =~ m/([A-Z]{$i})([A-Z])([A-Z]*)/;
    }
}

# Test using substr
sub bsubstr {
    for my $i (0..length($sequence)-1) {
        my ($left)   = substr($sequence,0,$i);
        my ($middle) = substr($sequence,$i,1);
        my ($right)  = substr($sequence,$i+1);
    }
}
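A benchmark only means something if the candidates return the same thing, so it may be worth checking that first. The sketch below is one way to do it, assuming the sequence contains only the letters A-Z as in the benchmark. The helpers split_re_orig, split_re_set and split_substr are hypothetical names: they are the three loop bodies above rewritten to return their pieces.

```perl
use strict;
use warnings;

# Random uppercase sequence, built the same way as in the benchmark
my $sequence = join '', map { chr(65 + int rand 26) } 1 .. 50;

# Hypothetical helpers: each returns (left, middle, right) for position $i
sub split_re_orig {
    my ($seq, $i) = @_;
    return $seq =~ m/(\w{$i})?(\w{1})?(\w*)/;
}

sub split_re_set {
    my ($seq, $i) = @_;
    return $seq =~ m/([A-Z]{$i})([A-Z])([A-Z]*)/;
}

sub split_substr {
    my ($seq, $i) = @_;
    return (substr($seq, 0, $i), substr($seq, $i, 1), substr($seq, $i + 1));
}

# Compare both regex versions against the substr version at every position
my $mismatches = 0;
for my $i (0 .. length($sequence) - 1) {
    my $expected = join '|', split_substr($sequence, $i);
    for my $candidate (\&split_re_orig, \&split_re_set) {
        my $got = join '|', $candidate->($sequence, $i);
        $mismatches++ if $got ne $expected;
    }
}

print $mismatches ? "strategies disagree\n" : "all strategies agree\n";
```

If the real data can contain characters outside A-Z or \w, the regex versions would no longer split the same way as substr, so the check should be rerun on representative input.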

Results for running 500 iterations with length 7:

          Rate re_orig  re_set bsubstr
re_orig  681/s      --    -12%    -92%
re_set   774/s     14%      --    -91%
bsubstr 8475/s   1144%    995%      --

Results for running 500 iterations with length 100:

          Rate re_orig  re_set bsubstr
re_orig 48.1/s      --    -12%    -93%
re_set  54.9/s     14%      --    -92%
bsubstr  718/s   1392%   1209%      --

Results for running 100 iterations with length 1000:

          Rate  re_set bsubstr
re_set  4.00/s      --    -93%
bsubstr 59.5/s   1387%      --

In reply to Re^3: Equation - code review by imp
in thread Equation - code review by kulls
