If you follow perl5-porters, you'll know this already, but I think this deserves all the exposure it can.

Long-time perl users might be aware that perl's regular expression engine is a recursive implementation. This usually work ok, except on rather pathological cases where so much recursion takes place that the C stack runs out of space and the process dies with a segmentation fault.

perl -le '$x = "o" x shift; $x =~ s/(o?)+//g' 10000 # Bus error: 10 (core dumped)

(if the above doesn't crash for you, just up the 10000 a bit).

Just last week, dave_the_m checked a patch into the perl source code repository that converts the algorithm used from recursive to iterative. Since then steve_p has been busily going through the bug database, closing out dozens of bugs that have been corrected due to this change. All in all it's a stunning piece of work, and Dave deserves to be congratulated for it.

The code is now in blead. Once it's settled a while, it may be backported to the maintenance track, and it will start to appear in the 5.8 series. Otherwise you'll have to wait until 5.10.

Read Dave's announcement.

• another intruder with the mooring in the heart of the Perl

  • Comment on The perl regular expression engine is now iterative rather than recursive
  • Download Code

Replies are listed 'Best First'.
Re: The perl regular expression engine is now iterative rather than recursive
by samtregar (Abbot) on Mar 30, 2006 at 18:32 UTC
    I was excited about this until I read the announcement and looked at the example code at the bottom! That's a whole lot of pain to address some hopeless regular expressions. Talk about spaghetti code...

    -sam

      Well, the announcement seems to imply that this is just the first iteration at getting a good iterative solution. ;-)

      Hopefully the next iteration results in something a little less spaghetti-like, and maybe something more like rotini with all the appropriate spirals and ... mmmm, I think it's lunch time.