Quick conjecture time (like usual)...
The greedy operators are optimized. They figure out what characters could occur directly after their match. So in the case of /^.*"/, the .* part knows that it will end right before a " and so does the equivalent of a rindex() to find the last " in the string. Then it lets the rest of the regex try to match. If that fails, then it backtracks to the previous ".
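To make that guess concrete, here is a rough Perl model of it. This is only a sketch of the conjecture, not the regex engine's real code: greedy_quote_match is a made-up helper, $rest stands in for "the rest of the regex" as a plain literal string, and the string is assumed to contain no newlines (so that . can match everything).

```perl
use strict;
use warnings;

# Hypothetical model of the conjectured strategy for /^.*"$rest/,
# where $rest is a literal string. Returns the offset of the '"'
# that the match settles on, or -1 if nothing matches.
sub greedy_quote_match {
    my ($str, $rest) = @_;
    my $pos = rindex($str, '"');              # jump straight to the last "
    while ($pos >= 0) {
        # let "the rest of the regex" try to match right after the quote
        return $pos if substr($str, $pos + 1, length $rest) eq $rest;
        last if $pos == 0;                    # no earlier " can exist
        $pos = rindex($str, '"', $pos - 1);   # backtrack to the previous "
    }
    return -1;
}
```

With 'a"b"c' and an empty $rest this settles on the last quote without any backtracking; with $rest set to 'b' it has to back up one quote.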
I always assumed that the non-greedy operators were optimized the same way. But based on your benchmarks, I've changed my mind. It looks like, in the case of /^.*?"/, the .*? part doesn't compute what it might be followed by; it just starts out matching nothing and lets the rest of the regex try to match. If this doesn't work, it forwardtracks to the next character. It should probably instead do the equivalent of index() at first and then forwardtrack to the next ", in this case.
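Here is a rough Perl comparison of the two non-greedy strategies, again only as a model of the guess and not of the engine itself. lazy_naive and lazy_indexed are made-up names, $rest is a literal string standing in for the rest of the regex, and the attempt counter is just a crude proxy for work done (index() still has to scan characters, only in a tighter loop).

```perl
use strict;
use warnings;

# Both subs model /^.*?"$rest/ for a literal $rest on a string with no
# newlines. Each returns (offset of the chosen '"', number of attempts).

# What the benchmarks suggest happens: try at every character in turn.
sub lazy_naive {
    my ($str, $rest) = @_;
    my $want  = '"' . $rest;
    my $tries = 0;
    for my $pos (0 .. length($str) - 1) {
        $tries++;
        return ($pos, $tries) if substr($str, $pos, length $want) eq $want;
    }
    return (-1, $tries);
}

# What it could do instead: index() to the first ", then on to the next ".
sub lazy_indexed {
    my ($str, $rest) = @_;
    my $tries = 0;
    my $pos   = index($str, '"');
    while ($pos >= 0) {
        $tries++;
        return ($pos, $tries) if substr($str, $pos + 1, length $rest) eq $rest;
        $pos = index($str, '"', $pos + 1);    # forwardtrack to the next "
    }
    return (-1, $tries);
}
```

Both pick the same quote; the difference is only in how many positions get tried along the way.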
Sorry, I don't have time to dig into the regex engine code right now. This would probably be an "easy" patch except for the fact that even easy patches to the regex engine require 12th-level deities.
In reply to Re: How are we lazy?
by tye
in thread How are we lazy?
by Ovid