in reply to regex gotcha moving from 5.8.8 to 5.30.0?

So I created a 50k.txt file and ran the function under every version of perl that I have (I have one of each major version installed).

For 5.8, 5.10, 5.12, 5.14, 5.16, and 5.18, it took around 11s to run. From 5.20 onward, it took 30s to run. I've been looking through the perldelta doc for 5.20 at the regexp-related stuff, and there are several changes, but I'm not seeing one that jumps out as related to this specifically... but it's at least a place to start. The 5.20 version is definitely where the jump happened.

FWIW, I was using 5.20.3, the last 5.20 release... I did not test earlier 5.20 releases, so I have not narrowed it down to a specific 5.20.x release.


Re^2: regex gotcha moving from 5.8.8 to 5.30.0?
by swl (Prior) on Feb 10, 2021 at 21:37 UTC

    Just a guess, but the delta for 5.20 includes this entry:

    Executing a regex that contains the ^ anchor (or its variant under the /m flag) has been made much faster in several situations.

    https://metacpan.org/pod/release/RJBS/perl-5.20.0/pod/perldelta.pod#Performance-Enhancements

    Maybe that enhancement has some side effects triggered by the \s* ^ \s* patterns in the regexps.

    Are you able to test what happens in pre-5.20 if these patterns are changed to a plain \s* (dropping the ^)?

    Edit - I should have asked for what happens either side of 5.20.
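
The comparison suggested above could be sketched roughly as follows and run under each installed perl. This is only a sketch under stated assumptions: the original function and the contents of 50k.txt are not shown in the thread, so the input text and both patterns here are hypothetical stand-ins that merely share the \s* ^ \s* shape, and the timings they produce may not reproduce the 11s vs 30s difference.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical input: 50,000 short lines padded with whitespace, standing in
# for the 50k.txt file from the thread (its real contents are unknown).
my $text = join '', map { "  line$_  \n" } 1 .. 50_000;

# Stand-in patterns, NOT the original poster's regexes: one keeps the
# \s* ^ \s* shape under /m, the other drops the ^ anchor.
my %patterns = (
    with_anchor    => qr/\s* ^ \s* (\w+)/mx,
    without_anchor => qr/\s* (\w+)/mx,
);

# Print the interpreter version so runs from different perls can be compared.
printf "perl %vd\n", $^V;

for my $name (sort keys %patterns) {
    my $re    = $patterns{$name};
    my $count = 0;
    my $start = time;
    $count++ while $text =~ /$re/g;    # walk every match in the string
    printf "%-15s %6d matches in %.3fs\n", $name, $count, time - $start;
}
```

Running the same script with a pre-5.20 perl and a 5.20+ perl, and comparing the two lines of timing output per version, would show whether the ^ anchor is what separates the fast and slow cases for this kind of input.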