in reply to Re^4: Help with Double Double Quotes regular expression (imprecise)
in thread Help with Double Double Quotes regular expression
Thanks. Excellent points.
Back to /A(B|C)*D/: after matching B, I have an ambiguity between matching a longer version of B and starting a new match against B. This ambiguity is harder to notice because it is a choice between B and B.
So, we can remove that ambiguity explicitly via:
    /"((?:[^"]+(?=")|"")*)"(?!")/
    #          ^^^^^
which is at least similar to
    /"((?:(?>[^"]+)|"")*)"(?!")/
    #     ^^^     ^
but I still like being explicit over disabling backtracking, because I learn things (like what you taught me). Thanks again.
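For anyone following along, here is a minimal sketch that runs the original pattern and both variants above over a few sample strings. The sample strings and the names %pattern and first_field are made up for illustration; they are not from the thread. It only shows that the three agree on these inputs, not that they are equivalent in general.

```perl
use strict;
use warnings;

# Three ways to match a "..." field in which "" stands for a literal quote.
my %pattern = (
    ambiguous => qr/"((?:[^"]+|"")*)"(?!")/,        # a run can be re-split on backtracking
    lookahead => qr/"((?:[^"]+(?=")|"")*)"(?!")/,   # a run must stop right before a quote
    atomic    => qr/"((?:(?>[^"]+)|"")*)"(?!")/,    # a run never gives characters back
);

# Return the contents of the first quoted field, or undef if nothing matches.
sub first_field {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

# Hypothetical sample inputs.
for my $text ('say "hi"', 'x "a ""b"" c" y', '""') {
    for my $name (sort keys %pattern) {
        my $got = first_field($pattern{$name}, $text);
        printf "%-9s %-18s => %s\n", $name, $text,
            defined $got ? "[$got]" : 'no match';
    }
}
```

The differences only show up when the match has to fail, which is where the backtracking cost comes from.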
I (now) recall discussions of this before where it was noted (by tilly) that Perl's regex engine was smarter than many in knowing how to avoid pathological performance in something at least similar to this situation. But checking "use re 'debug';" I see that the pathological case is indeed being triggered.
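Besides reading the "use re 'debug';" output, one rough way to see the extra work is to count how often the [^"]+ branch succeeds. This is only a sketch: the counter $runs, the helper count_runs, and the test string are my own assumptions, and embedding a (?{ ... }) code block may itself change which optimizations the engine applies, so treat the numbers as relative at best.

```perl
use strict;
use warnings;

our $runs = 0;   # package variable, so the embedded code blocks can see it

# The patterns under discussion, with a counter bumped every time the
# "run of non-quotes" branch succeeds (including retries after backtracking).
my $ambiguous = qr/"((?:[^"]+(?{ $runs++ })|"")*)"(?!")/;
my $lookahead = qr/"((?:[^"]+(?=")(?{ $runs++ })|"")*)"(?!")/;

# Count how often the run branch succeeded while matching $text.
sub count_runs {
    my ($re, $text) = @_;
    $runs = 0;
    $text =~ $re;
    return $runs;
}

# Hypothetical awkward input: a long run, then a doubled quote, and no
# closing quote for the field that starts at the first character.
my $text = '"' . ('a' x 12) . '""';

printf "ambiguous: %d run matches\n", count_runs($ambiguous, $text);
printf "lookahead: %d run matches\n", count_runs($lookahead, $text);
```

Making the run of 'a' longer should make the gap between the two counts easier to see.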
Update: I've been known to do the equivalent of:
    if( /"((?:[^"]+|"")*)("?)(?!")/ ) {
    #                    ^ ^^
        die "Unclosed quote ($1)..." if ! $2;
which I think also gets rid of the problem.
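Spelled out as a complete routine, that idea might look something like the following. The sub name quoted_field and the sample inputs are hypothetical, and returning nothing when there is no quote at all is my own choice rather than anything from the snippet above.

```perl
use strict;
use warnings;

# Return the contents of the first quoted field in $text.
# Dies if the opening quote is never closed; returns nothing (undef in
# scalar context) if $text contains no quote at all.
sub quoted_field {
    my ($text) = @_;
    if ($text =~ /"((?:[^"]+|"")*)("?)(?!")/) {
        my ($body, $close) = ($1, $2);
        # The closing quote is optional in the pattern, so the match
        # succeeds either way and we check for it afterwards.
        die "Unclosed quote ($body)...\n" if !$close;
        return $body;
    }
    return;
}

print quoted_field('x "a ""b"" c" y'), "\n";

my $ok = eval { quoted_field('x "never closed'); 1 };
print "error: $@" unless $ok;
```

Because the closing quote is optional, the pattern has no reason to go back and re-split a run of non-quote characters, and the check on $2 is what reports the unclosed case.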
- tye