Perl is doing much more (but smarter) work when it finds a R in the string:

use strict; use warnings; use Benchmark qw/ cmpthese /; my $R = 'R' x 10; my $Q = 'Q' x 10; use re 'debug'; sub foo { my $c = shift; pos $$c = 0; while ( 1 ) { next if $$c =~ /\G\d+R/; last unless $$c =~ /\w/g; } } print "-" x 80, "\n"; foo( \$Q ); print "-" x 80, "\n"; foo( \$R ); print "-" x 80, "\n";
Matching REx "\G\d+R" against "QQQQQQQQQQ" Intuit: trying to determine minimum start position... Did not find floating substr "R"... Match rejected by optimizer Matching REx "\w" against "QQQQQQQQQQ" Matching stclass POSIXD[\w] against "QQQQQQQQQQ" (10 bytes) 0 <> <QQQQQQQQQQ> | 1:POSIXD[\w](2) 1 <Q> <QQQQQQQQQ> | 2:END(0) Match successful! ... Matching REx "\G\d+R" against "QQQQQQQQQQ" Regex match can't succeed, so not even tried Matching REx "\w" against "Q" Matching stclass POSIXD[\w] against "Q" (1 bytes) 9 <QQQQQQQQQ> <Q> | 1:POSIXD[\w](2) 10 <QQQQQQQQQQ> <> | 2:END(0) Match successful! Matching REx "\G\d+R" against "QQQQQQQQQQ" Regex match can't succeed, so not even tried Matching REx "\w" against "" Regex match can't succeed, so not even tried ---------------------------------------------------- ---------------------------------------------------------------------- +---------- Matching REx "\G\d+R" against "RRRRRRRRRR" Intuit: trying to determine minimum start position... Found floating substr "R" at offset 1... (multiline anchor test skipped) looking for class: start_shift: 1 check_at: 1 rx_origin: 0 endpos: 1 This position contradicts STCLASS... Match rejected by optimizer Matching REx "\w" against "RRRRRRRRRR" Matching stclass POSIXD[\w] against "RRRRRRRRRR" (10 bytes) 0 <> <RRRRRRRRRR> | 1:POSIXD[\w](2) 1 <R> <RRRRRRRRR> | 2:END(0) Match successful! Matching REx "\G\d+R" against "RRRRRRRRRR" Intuit: trying to determine minimum start position... Found floating substr "R" at offset 1... (multiline anchor test skipped) looking for class: start_shift: 1 check_at: 2 rx_origin: 1 endpos: 2 This position contradicts STCLASS... Match rejected by optimizer Matching REx "\w" against "RRRRRRRRR" Matching stclass POSIXD[\w] against "RRRRRRRRR" (9 bytes) 1 <R> <RRRRRRRRR> | 1:POSIXD[\w](2) 2 <RR> <RRRRRRRR> | 2:END(0) Match successful! ...

I'm not sure how you could make Perl try not to scan the string for the fixed substring R but immediately look for \G\d to immediately reject the parse at that location, no matter what follows.

By splitting up the parsing between \d+ and R, I can make both cases behave slightly 18% worse than \G\d+R, but I'm not sure if that's correct:

use strict; use warnings; use Benchmark qw/ cmpthese /; my $R = 'R' x 42_000; my $Q = 'Q' x 42_000; sub foo { my $c = shift; pos $$c = 0; while ( 1 ) { next if $$c =~ /\G\d+R/; last unless $$c =~ /\w/g; } } sub foo_twostep { my $c = shift; pos $$c = 0; while ( 1 ) { next if ($$c =~ /\G[0-9]+/ and $$c =~ /\GR/); last unless $$c =~ /\w/g; } } cmpthese -3, { R => sub { foo( \$R )}, Q => sub { foo( \$Q )}, R_twostep => sub { foo_twostep( \$R )}, Q_twostep => sub { foo_twostep( \$Q )}, } __END__ Rate Q R_twostep Q_twostep R Q 2.01/s -- -97% -97% -97% R_twostep 65.3/s 3147% -- 0% -15% Q_twostep 65.3/s 3147% 0% -- -15% R 77.1/s 3732% 18% 18% --

Update: Using only ASCII digits, I can get a slowdown of 12%:

use strict; use warnings; use Benchmark qw/ cmpthese /; my $R = 'R' x 42_000; my $Q = 'Q' x 42_000; sub foo { my $c = shift; pos $$c = 0; while ( 1 ) { next if $$c =~ /\G\d+R/; last unless $$c =~ /\w/g; } } sub foo_twostep { my $c = shift; pos $$c = 0; while ( 1 ) { next if ($$c =~ /\G\d+/ and $$c =~ /\GR/); last unless $$c =~ /\w/g; } } sub foo_asciidigits { my $c = shift; pos $$c = 0; while ( 1 ) { next if ($$c =~ /\G[0-9]/ and $$c =~ /\GR/); last unless $$c =~ /\w/g; } } cmpthese -3, { R => sub { foo( \$R )}, Q => sub { foo( \$Q )}, R_twostep => sub { foo_twostep( \$R )}, Q_twostep => sub { foo_twostep( \$Q )}, R_ascii => sub { foo_asciidigits( \$R )}, Q_ascii => sub { foo_asciidigits( \$Q )}, } __END__ Rate Q Q_twostep R_twostep R_ascii Q_ascii + R Q 1.99/s -- -97% -97% -97% -97% + -97% Q_twostep 62.9/s 3054% -- -1% -7% -8% + -17% R_twostep 63.8/s 3101% 1% -- -5% -6% + -16% R_ascii 67.5/s 3287% 7% 6% -- -1% + -11% Q_ascii 68.1/s 3319% 8% 7% 1% -- + -10% R 76.0/s 3715% 21% 19% 13% 12% + --

In reply to Re: Why is Perl suddenly slow in THIS case? by Corion
in thread Why is Perl suddenly slow in THIS case? by vr

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.