Somehow I thought that the possessive quantifiers in perl 5.10 could only boost up performance, and never degrade it (neglecting a bit of overhead, perhaps). It seems that I was wrong:
#!/usr/bin/perl use strict; use warnings; use 5.010; use Benchmark qw(cmpthese); my $str = "bea" x 100; my $re = qr/(?:be|ea|a)/; cmpthese(-2, { atomic1 => sub { die if $str =~ m/(?>$re+)\d/ }, atomic2 => sub { die if $str =~ m/(?>$re)+\d/ }, normal => sub { die if $str =~ m/$re+\d/ }, posessive => sub { die if $str =~ m/$re++\d/ }, }); __END__ Rate posessive atomic1 atomic2 normal posessive 93.3/s -- 0% -96% -97% atomic1 93.3/s 0% -- -96% -97% atomic2 2545/s 2628% 2628% -- -8% normal 2764/s 2862% 2862% 9% --
You can see that the normal quantifier is the fastest, more than 20 times faster than the possessive quantifier.

But why?

I tried to decipher the -Mre=debug output, and it seems that, with a normal quantifier, the regex engine does some backtracking, so it can't be a magic optimization that immediately proves that the match fails.

Or is my benchmark flawed in some way?

Update 1: I talked a bit on IRC with demerphq, and as he wrote the problem seems to be some caching problem.

So far it's not clear if 5.8.8 is caching too eagerly, or 5.10.0 too little.

It's also clear that the pattern isn't very useful, in most cases such a pattern would be anchored, in which case it's blazingly fast ;-)

I also informed our formidable porters.


In reply to Performance of possessive quantifiers by moritz

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.