The general question is: given a regexp fragment like
is is possible for the prefix and suffix to be simultaneously (and repeatedly) empty? If so, then anything that matches /(meat){n}/ will be able to match at least 2^n different ways.( prefix (meat)* suffix )*
I'm sure that at the least a subset of exponential regexps should be discernible. I imagine the first stage of such an approach would be to iteratively strip out anything that can be zero-width, in which case it would be rather harder to detect that eg:
is linear. Ask me again when Perl6 has a full grammar implementation - it might be an interesting project to write such a detector targetting a pluggable regexp grammar./( (foo)* (?=b) (bar)? )* baz )/x
Note that one generic approach to fixing such regexps is to use the 'cut' operator (?> ...), but as far as I am aware that operator is available only in perl itself and the ever-faithful PCRE. For example:
avoids the exponential behaviour by disallowing backtracking into the [^']*./( (?> [^']* ) ('[^']*')? )* /x
Hugo
In reply to Re^3: This regex seems to have splattered non-greedy everywhere
by hv
in thread This regex seems to have splattered non-greedy everywhere
by fizbin
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |