in reply to Interesting behavior of regular expression engine

Because the first one is triggering an optimisation. If you do '.+b', you might expect the engine to naively keep consuming non-newline chars (.) until it can't go any further, then fail to match a 'b', backtrack one position, fail the 'b' again, and continue backtracking until only one '.' is consumed and the following 'b' is finally matched.
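
To make that naive picture concrete, here is a rough toy model of it. It is not perl's real engine, just a hand-written loop that counts how often 'b' would be tried for '.+b' at a single start position. The sub name naive_dot_plus_b and the sample string 'abcdefghij' are made up for illustration.

```perl
use strict;
use warnings;

# Toy model of naive matching for the pattern '.+b' at one start position.
# This is NOT how perl's engine is written; it only counts the steps
# of the greedy-then-backtrack strategy.
sub naive_dot_plus_b {
    my ($str, $start) = @_;
    my $len = length $str;

    # Greedy phase: '.' consumes every non-newline character it can.
    my $end = $start;
    $end++ while $end < $len && substr($str, $end, 1) ne "\n";
    return if $end == $start;    # '.+' needs at least one character

    # Backtracking phase: give back one character at a time and
    # retry 'b' at each position.
    my $tries = 0;
    for (my $pos = $end; $pos > $start; $pos--) {
        $tries++;
        if ($pos < $len && substr($str, $pos, 1) eq 'b') {
            return {
                matched => substr($str, $start, $pos - $start + 1),
                tries   => $tries,
            };
        }
    }
    return { matched => undef, tries => $tries };
}

# Hypothetical sample input, not the string from the original question.
my $result = naive_dot_plus_b('abcdefghij', 0);
printf "matched '%s' after %d attempts at 'b'\n",
    $result->{matched}, $result->{tries};
```

The number of attempts grows with the length of the line, which is the cost the optimisation avoids.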

Instead, the code that implements '.+' is optimised to know that it must be followed by a known fixed string ('bcdef'), and it avoids consuming so many characters that the constraint can't be met.
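
If you want to peek at which fixed strings the optimiser has spotted in a pattern, one option is regmust() from the re module (available since perl 5.10). A small sketch follows, assuming the pattern in question looks like /.+bcdef/. Note that regmust() only reports the fixed substrings the optimiser found; whether that is exactly the mechanism that limits the backtracking here is an assumption on my part.

```perl
use strict;
use warnings;
use re qw(regmust);

# Assumed pattern, modelled on the fixed string 'bcdef' mentioned above.
my $re = qr/.+bcdef/;

# regmust() returns the longest anchored and longest floating fixed
# strings the optimiser found in the compiled pattern (undef if none).
my ($anchored, $floating) = regmust($re);

for my $pair ([anchored => $anchored], [floating => $floating]) {
    my ($kind, $str) = @$pair;
    printf "%-8s %s\n", $kind, defined $str ? "'$str'" : '(none)';
}
```

For a more detailed view, running the match under "use re 'debug';" dumps the compiled program and the execution trace to STDERR.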

In the second case it isn't followed by a fixed string, but rather by a '.', so the optimisation isn't triggered, and the engine falls back to a naive series of backtracks.

Dave.

Re^2: Interesting behavior of regular expression engine
by lightoverhead (Pilgrim) on Mar 12, 2013 at 23:29 UTC
    Thank you Dave. Your answer is very clear!