in reply to Re^6: This regexp made simpler
in thread This regexp made simpler

I haven't really thought about whether the solutions with .*? are actually incorrect, but most of them will almost certainly go wrong if you later extend the regex with something that might force backtracking into the preceding construct.

Example:

$ perl -wE 'say "yes" if "A BCZD ZA" =~ /^A(\s.*?)?ZA/'
yes
Here I added an A to the end of the regex. When the first Z isn't followed by an A, the engine backtracks and lets .*? run past that Z, which in turn allows a match that was forbidden by your rules.
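
To make the backtracking visible, here is a small sketch that captures what the optional group ends up holding with and without the trailing A. The variable names ($text, $short, $long) are mine, chosen only for illustration.

```perl
use strict;
use warnings;

my $text = "A BCZD ZA";

# Without the trailing A, the lazy group stops at the first Z.
my ($short) = $text =~ /^A(\s.*?)?Z/;

# With the trailing A, the first Z is followed by D, so the engine
# backtracks into .*? and lets it extend past that Z.
my ($long) = $text =~ /^A(\s.*?)?ZA/;

print "without trailing A: '$short'\n";
print "with trailing A:    '$long'\n";
```

In the second case the group swallows a Z, which is exactly what "minimal match" was supposed to prevent.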

(Update: This is a general problem when translating "may not occur in between" to "minimal match": the two are equivalent only under quite specific conditions. You can "rescue" such a solution by putting it in a (?>...) non-backtracking group, but I still recommend against it.)
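
As a rough sketch of that "rescue", the following compares an atomic-group version with a version that spells out "no Z in between" via a negative lookahead. The patterns $atomic and $lookahead and the first two test strings are my own assumptions for illustration; only "A BCZD ZA" comes from the example above. The two patterns agree on these inputs, which is not a proof that they agree in general.

```perl
use strict;
use warnings;

my @inputs = ("A BCZA", "AZA", "A BCZD ZA");

# Atomic group: once (\s.*?)?Z has matched, the engine may not
# backtrack into it again, so .*? cannot be extended past the first Z.
my $atomic = qr/^A(?>(\s.*?)?Z)A/;

# Explicit form: the filler after the whitespace may not contain Z.
my $lookahead = qr/^A(\s(?:(?!Z).)*)?ZA/;

my %result;
for my $s (@inputs) {
    $result{$s} = [ ($s =~ $atomic ? 1 : 0), ($s =~ $lookahead ? 1 : 0) ];
    printf "%-12s atomic=%d lookahead=%d\n", "'$s'", @{ $result{$s} };
}
```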

So maybe your example wasn't actually wrong (and I apologize for having called it so without any proof), but it's surely not very maintainable, because a very simple, innocent change can break it.

Re^8: This regexp made simpler
by rovf (Priest) on Apr 26, 2010 at 12:06 UTC
    I see the problem. Thanks for pointing this out!

    -- 
    Ronald Fischer <ynnor@mm.st>
Re^8: This regexp made simpler
by rubasov (Friar) on Apr 26, 2010 at 14:31 UTC
    To satisfy my curiosity I wrote the backtracking control version you mentioned and I was surprised how simple it is:
    while (<DATA>) {
        print;
        s[ ^ (START) ( | \s.*? ) (END) (*COMMIT) $ ]
         [ $1 . $2 . 'insert' . $3 ]ex;
        print;
    }
    __DATA__
    STARTEND
    STARTENDEND
    START SOMETHING END
    STARTSOMETHINGEND
    START END
    START ENDEND
    STARTSTARTENDEND
    STARTSTART ENDEND

    For comparison here's the other form:

    s[ ^ (START) ( | \s (?:(?!END).)* ) (END) $ ] [ $1 . $2 . 'insert' . $3 ]ex;
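
    A hedged sketch for checking that the two forms behave the same on the sample data: it applies both substitutions to a copy of each line and prints the results side by side. The way the sample lines are split and the names @lines, %out, $commit and $lookahead are assumptions made for illustration.

```perl
use strict;
use warnings;

# Sample lines from the post above; the exact line breaks are an assumption.
my @lines = (
    'STARTEND', 'STARTENDEND', 'START SOMETHING END', 'STARTSOMETHINGEND',
    'START END', 'START ENDEND', 'STARTSTARTENDEND', 'STARTSTART ENDEND',
);

my %out;
for my $line (@lines) {
    # Backtracking-control form: after the first END, (*COMMIT)
    # makes the whole match fail if $ does not match.
    (my $commit = $line) =~
        s[ ^ (START) ( | \s.*? ) (END) (*COMMIT) $ ]
         [ $1 . $2 . 'insert' . $3 ]ex;

    # Negative-lookahead form: the filler may not contain END.
    (my $lookahead = $line) =~
        s[ ^ (START) ( | \s (?:(?!END).)* ) (END) $ ]
         [ $1 . $2 . 'insert' . $3 ]ex;

    $out{$line} = [ $commit, $lookahead ];
    printf "%-20s %-26s %s\n", $line, $commit, $lookahead;
}
```

    A line such as START ENDEND is the interesting case: a plain \s.*? without (*COMMIT) could backtrack and match up to the second END, while both forms here leave the line unchanged.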