
And here is a benchmark for the 3 solutions.

It seems that doing the whole test inside the regular expression, with a backreference, is fastest. Next comes the regular expression with two captures and OR logic. The split is slowest.

use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample lines; exactly 5 of the 9 have equal numbers.
my @data = ('foo 1000 bar 1000', 'foo 1000 bar 500',
         'foo 500 bar 1000', 'foo 500 bar 500',
         'foo 1 bar 1',
         'foo 1 bar 2',
         'foo 2 bar 2',
         'foo 10000 bar 1000',
         'foo 10000 bar 10000',
         );

my $tot = 5;

my $count = 100000;

cmpthese($count, {
        'backref' => sub {
                my $x=0;
                for (@data) {
                        # \1 must repeat the first capture exactly
                        $x++ if /^\S+ (\d+) \S+ \1$/;
                }
                die "$x" unless $x == $tot;
        },
        'split' => sub {
                my $x=0;
                for (@data) {
                        my @arr = split /\s+/;
                        $x++ if $arr[1] == $arr[3];
                }
                die $x unless $x == $tot;
        },
        'simple' => sub {
                my $x=0;
                for (@data) {
                        # count lines that do not match at all, or whose two numbers are equal
                        $x++ if !m/foo (\d+) bar (\d+)/ || $1 == $2;
                }
                die $x unless $x == $tot;
        },
});

__END__

           Rate   split  simple backref
split    4329/s      --    -31%    -66%
simple   6313/s     46%      --    -50%
backref 12658/s    192%    101%      --

UPDATE: As Sam points out, I had the efficiency of the different techniques completely backwards. I also noticed a bug in the simple technique. Fixing that sped it up by 50%.
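
For anyone who wants to try the three checks outside the benchmark harness, here is a minimal sketch that wraps each one in a sub and runs them side by side on a few of the sample strings. The sub names (by_backref, by_split, by_capture) are mine and purely illustrative; they do not come from the thread. One caveat worth keeping in mind: the backreference compares the digits as text, while the other two compare numerically, so inputs with leading zeros could give different answers.

```perl
use strict;
use warnings;

# Hypothetical helper names, chosen for illustration only.

# Backreference: \1 must repeat the captured digits exactly (text comparison).
sub by_backref { return $_[0] =~ /^\S+ (\d+) \S+ \1$/ ? 1 : 0 }

# Split on whitespace, then compare fields 1 and 3 numerically.
sub by_split {
    my @f = split /\s+/, $_[0];
    return $f[1] == $f[3] ? 1 : 0;
}

# Capture both numbers and compare numerically.
# Lines that do not match at all are accepted, as in the benchmark.
sub by_capture {
    return 1 unless $_[0] =~ /foo (\d+) bar (\d+)/;
    return $1 == $2 ? 1 : 0;
}

for my $line ('foo 1000 bar 1000', 'foo 1000 bar 500', 'foo 10000 bar 1000') {
    printf "%-20s backref=%d split=%d capture=%d\n",
        $line, by_backref($line), by_split($line), by_capture($line);
}
```

On these strings all three subs should agree; only the first line has equal numbers.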

-- gam3
A picture is worth a thousand words, but takes 200K.

In reply to Re: regex search valid only if registers n and n+1 are equal? by gam3
in thread regex search valid only if registers n and n+1 are equal? by Voronich
