Re^3: Perl regexp matching is slow??

It certainly sounds like distinguishing | and || is a good idea. I do not claim to fully understand everything in Synopsis 5, but the general idea seems to be on the right track.

One interesting gotcha (I can't tell whether it is handled correctly or not): consider the Perl 5 regular expression (.*)\1 run on (("a" x $n) . "b") x 2. It should match, with the group capturing the first half of the string. However, figuring that out requires trying many possible lengths for the (.*) part: the first try is the whole string, then the whole string minus one character, and so on, until you get down to exactly half the string.
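To make the backtracking concrete, here is a rough sketch that imitates by hand what the engine does at the start of the string. The helper name first_backref_match is made up for illustration, and the loop is only a model of the search, not how Perl's regex engine is implemented.

```perl
use strict;
use warnings;

# Hand-rolled model of /(.*)\1/ tried at offset 0: shrink the capture one
# character at a time until the text right after it repeats the capture.
# (first_backref_match is a hypothetical helper, not a Perl built-in.)
sub first_backref_match {
    my ($str) = @_;
    my $tries = 0;
    for (my $k = length $str; $k >= 0; $k--) {
        $tries++;
        my $cap = substr($str, 0, $k);
        # \1 has to match the captured text again, starting at offset $k
        return ($cap, $tries) if substr($str, $k, $k) eq $cap;
    }
    return;
}

my $n   = 5;
my $str = (("a" x $n) . "b") x 2;          # "aaaaabaaaaab"

my ($cap, $tries) = first_backref_match($str);
print "captured '$cap' after $tries tries\n";   # captured 'aaaaab' after 7 tries

# The real engine arrives at the same capture.
print "regex agrees\n" if $str =~ /(.*)\1/ && $1 eq $cap;
```

For this input the number of tries grows as $n + 2, since the capture has to shrink from the full 2 * ($n + 1) characters down to $n + 1.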

It is not clear to me that the longest-token semantics would get this right: it sounds like they would grab the whole string during (.*), fail to match in the equivalent of \1, and then give up.
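The failure mode I am worried about can be imitated in Perl 5 itself with a possessive quantifier, which grabs as much as it can and never gives any of it back. This is only an analogy for "take the longest and don't reconsider"; I am not claiming it is what Synopsis 5 specifies. The patterns are anchored with ^ so that the engine cannot fall back to an empty match at a later offset.

```perl
use strict;
use warnings;
use 5.010;   # possessive quantifiers need Perl 5.10 or later

my $n   = 5;
my $str = (("a" x $n) . "b") x 2;

# Ordinary greedy (.*) backtracks until \1 can match.
my $backtracking = ($str =~ /^(.*)\1/)  ? 1 : 0;

# Possessive (.*+) keeps the whole string, so \1 has nothing left to match.
my $possessive   = ($str =~ /^(.*+)\1/) ? 1 : 0;

print "backtracking (.*):  ", ($backtracking ? "match" : "no match"), "\n";
print "possessive   (.*+): ", ($possessive   ? "match" : "no match"), "\n";
```

The first pattern matches and the second does not, which is exactly the "grab everything, fail in \1, give up" behaviour described above.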

All this is just to say that backreferences are even harder than they first appear, not that the longest-token idea is flawed.
