Someone asked privately if the % in [^%>] was required. That is a good question, so I decided to answer it in public.
Without that %, we get:

    m[
        <%              # Opening delimiter.
        (?:             # Match stuff that isn't a closing delim:
            [^%]+       # Things that can't start one.
        |   %+[^>]      # Might start one but isn't one.
        )*              # As many non-closing-delims as you like.
        %>              # Closing delimiter.
    ]x

which will match as follows:

    "<% %%> %>"
    "<%" matches <%
    " "  matches [^%]+ so (?: ... )* has matched once
    "%%" matches %+
    ">"  fails on [^>] so we back-track
    "%"  now matches %+
    "%"  matches [^>] so (?: ... )* has matched twice
    "> " matches [^%]+ so (?: ... )* has matched 3 times
    "%>" matches %> so regex finishes

so we've matched the whole string when we should have only matched the first part, "<% %%>".
By leaving the % out of [^%>], we've allowed the regex to back-track and match the first character of our delimiter (%) as the tail end of %+[^>].
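To see the over-match for yourself, here is a minimal sketch that runs the variant without the % against the sample string. The names $without_percent, $input and $matched are just for illustration, and qr{}x stands in for the m[]x form used in this node.

```perl
use strict;
use warnings;

# The variant under discussion: [^>] where the original had [^%>].
my $without_percent = qr{
    <%              # Opening delimiter.
    (?:
        [^%]+       # Things that can't start a closing delimiter.
      | %+[^>]      # The "%" has been dropped from this class.
    )*
    %>              # Closing delimiter.
}x;

my $input = "<% %%> %>";

# Capture whatever the regex matched so it can be inspected.
my ($matched) = $input =~ /($without_percent)/;
print "matched: '$matched'\n";
```

This should print the whole string, `<% %%> %>`, rather than stopping at the first `%>`.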
But I now realize that my regex is also broken because it will never match:

    "<% %%>"

at all. I'm tempted to fix it with:

    m[
        <%              # Opening delimiter.
        (?:             # Match stuff that isn't a closing delim:
            [^%]+       # Things that can't start one.
        |   %+[^%>]     # Might start one but isn't one.
        )*              # As many non-closing-delims as you like.
        %*              # PUNT!
        %>              # Closing delimiter.
    ]x

but that seems wrong. Think...
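A quick way to check both claims is to run the [^%>] regex and the "PUNT" variant side by side. This is only a sketch: first_match, $original and $punted are hypothetical names, and two sample strings are nowhere near enough to prove that the punted version is right in general.

```perl
use strict;
use warnings;

# The regex with [^%>], and the same regex with the extra %* ("PUNT").
my $original = qr{ <% (?: [^%]+ | %+[^%>] )*    %> }x;
my $punted   = qr{ <% (?: [^%]+ | %+[^%>] )* %* %> }x;

# Return the text matched by $re in $input, or undef if nothing matches.
sub first_match {
    my ($re, $input) = @_;
    return $input =~ /($re)/ ? $1 : undef;
}

for my $input ("<% %%>", "<% %%> %>") {
    for my $case ([original => $original], [punted => $punted]) {
        my ($name, $re) = @$case;
        my $got = first_match($re, $input);
        printf "%-8s on '%s': %s\n", $name, $input,
            defined $got ? "'$got'" : "no match";
    }
}
```

On both inputs the original should report no match, while the punted version should stop at the first closing delimiter and return `<% %%>`.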
Bah, I'm hours late for bed already. Serves me right for "showing off" by "unrolling the loop" when I've seen *so many* *really good* regex slingers get this wrong more than once. (:
Unlike the last time I saw this happen, these nodes will *not* be updated to hide the mistakes I've made (that last time the updates were flying really fast and I was extremely frustrated by not being able to learn from the repeated mistakes).
- tye

In reply to Re^2: A NOT in regular expressions (why [^%>]?) by tye, in thread "A NOT in regular expressions" by Anonymous Monk