Somehow I thought that the possessive quantifiers in perl 5.10 could only improve performance and never degrade it (neglecting a bit of overhead, perhaps). It seems that I was wrong:

#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Benchmark qw(cmpthese);

my $str = "bea" x 100;
my $re = qr/(?:be|ea|a)/;
cmpthese(-2, {
    atomic1   => sub { die if $str =~ m/(?>$re+)\d/ },
    atomic2   => sub { die if $str =~ m/(?>$re)+\d/ },
    normal    => sub { die if $str =~ m/$re+\d/ },
    posessive => sub { die if $str =~ m/$re++\d/ },
});
__END__
            Rate posessive   atomic1   atomic2    normal
posessive 93.3/s        --        0%      -96%      -97%
atomic1   93.3/s        0%        --      -96%      -97%
atomic2   2545/s     2628%     2628%        --       -8%
normal    2764/s     2862%     2862%        9%        --

You can see that the normal quantifier is the fastest, almost 30 times faster than the possessive quantifier (2764/s versus 93.3/s).

But why?

I tried to decipher the -Mre=debug output, and it seems that the regex engine does some backtracking with the normal quantifier, so the speed can't come from a magic optimization that immediately proves that the match fails.
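For anyone who wants to look at the trace themselves, here is a minimal sketch of how the debug output can be produced. The shortened subject string ("bea" x 3) and the variable names $normal and $possessive are my own choices to keep the trace readable; they are not part of the benchmark above. The trace itself goes to STDERR.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;

# A much shorter subject than in the benchmark, so the trace stays readable.
my $str = "bea" x 3;
my $re  = qr/(?:be|ea|a)/;

my ($normal, $possessive);
{
    # Lexically scoped: only the two matches in this block are traced.
    use re 'debug';
    $normal     = ($str =~ m/$re+\d/)  ? 1 : 0;   # ordinary greedy quantifier
    $possessive = ($str =~ m/$re++\d/) ? 1 : 0;   # possessive quantifier
}

# The subject contains no digit, so neither pattern can match.
say "normal matched: $normal, possessive matched: $possessive";
```

Running it as `perl script.pl 2>trace.txt` keeps the trace separate from the normal output, which makes it easier to compare the two matches side by side.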

Or is my benchmark flawed in some way?

Update 1: I talked a bit on IRC with demerphq, and, as he wrote, the problem seems to be related to caching.

So far it's not clear if 5.8.8 is caching too eagerly, or 5.10.0 too little.

It's also clear that the pattern isn't very useful as written. In most cases such a pattern would be anchored, in which case it's blazingly fast ;-)
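A sketch of what the anchored variant of the benchmark could look like, assuming "anchored" simply means a leading ^ on each pattern. I use -1 instead of -2 as the CPU time per case only to keep the run short. I'm deliberately not quoting numbers here, since they will depend on the perl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Benchmark qw(cmpthese);

my $str = "bea" x 100;
my $re  = qr/(?:be|ea|a)/;

# Same four patterns as above, each anchored with ^ so that the engine
# only attempts a match at the start of the string.
cmpthese(-1, {
    atomic1   => sub { die if $str =~ m/^(?>$re+)\d/ },
    atomic2   => sub { die if $str =~ m/^(?>$re)+\d/ },
    normal    => sub { die if $str =~ m/^$re+\d/ },
    posessive => sub { die if $str =~ m/^$re++\d/ },
});
```

With the anchor, a failed attempt at position 0 ends the match, so the engine doesn't retry at each of the 300 start positions.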

I also informed our formidable porters.

In reply to Performance of possessive quantifiers by moritz
