in reply to ERROR ... nested quantifiers in regex

Not only possessive quantifiers, but also the (?1) family of extended patterns were introduced with 5.10. However, the 'recursive parsing' trick can still be done with 5.8, so if 5.10/5.12 cannot be installed, check back here for more info. (But see Update below.)

(See Text::Balanced for all of the following functionality – and there's more!)
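For a taste of the module, here is a minimal sketch using extract_bracketed from Text::Balanced. Note the assumption: it uses single-character braces rather than the two-character sequences in the examples that follow, because extract_bracketed works with bracket characters. The sample string is made up for illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Illustrative input (not from the original post); the text must begin
# with the opening bracket, optionally preceded by whitespace.
my $text = '{ foo { bar } baz } rest';

# In list context the first two return values are the balanced group
# and whatever follows it.
my ($extracted, $remainder) = extract_bracketed($text, '{}');

print "extracted: '$extracted'\n";
print "remainder: '$remainder'\n";
```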

The following code requires 5.10+.

I find it useful to decompose regexes. (Closing sequence arbitrarily redefined to ']]' in example; could be any multi-character sequence.)

    >perl -wMstrict -le
    "my $open = '{{';
     my $close = ']]';
     ;;
     my $opener = qr{ \Q$open\E }xms;
     my $closer = qr{ \Q$close\E }xms;
     my $body = qr{ [^\Q$open$close\E] }xms;
     ;;
     my $regex = qr{ ( $opener (?: $body++ | (?1) )* $closer ) }xms;
     ;;
     my $s = 'xxx {{ foo {{ bar ]] baz ]] yyy {{ fee ]] zzz';
     ;;
     print qq{'$1'} while $s =~ m{ $regex }xmsg;
    "
    '{{ foo {{ bar ]] baz ]]'
    '{{ fee ]]'

This approach breaks down when we alter the string being searched to
    my $s =
      'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
producing the output
    '{{ bar ]]'
    '{{ fee ]]'
because of the presence of the substring '[OK]' having the character ']' from the closing sequence.
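To see why, it may help to test single characters against the character-class definition of $body. This is only a sketch: is_body_char is a hypothetical helper added for illustration, not part of the one-liner above.

```perl
use strict;
use warnings;

my ($open, $close) = ('{{', ']]');

# Same definition as the first example: a negated character class built
# from the individual characters of both delimiter sequences.
my $body = qr{ [^\Q$open$close\E] }xms;

# Hypothetical helper: true if a single character is accepted as body.
sub is_body_char {
    my ($ch) = @_;
    return $ch =~ m{ \A $body \z }xms ? 1 : 0;
}

for my $ch ('a', '[', ']', '{', '}') {
    printf "'%s' %s\n", $ch, is_body_char($ch) ? 'accepted' : 'rejected';
}
```

The class excludes every individual ']' and '{', not just the two-character sequences, so a lone ']' can never be consumed as body text.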

This problem can be fixed by changing the definition of $body to
    my $body = qr{ (?! $opener) (?! $closer) . }xms;
which restores the output to the expected
    '{{ foo {{ bar ]] baz [OK] ]]'
    '{{ fee ]]'
again.
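Here is the same 5.10+ regex laid out as a script with comments, which may be easier to follow than a one-liner. Treat it as a sketch: the sub name balanced_groups and its argument order are my own choices for illustration.

```perl
use strict;
use warnings;
use 5.010;    # (?1) recursion and possessive quantifiers need 5.10+

# Hypothetical helper: returns every outermost balanced group
# delimited by the given opening and closing sequences.
sub balanced_groups {
    my ($s, $open, $close) = @_;

    my $opener = qr{ \Q$open\E }xms;
    my $closer = qr{ \Q$close\E }xms;

    # One character at a position where neither delimiter starts.
    my $body = qr{ (?! $opener) (?! $closer) . }xms;

    my $regex = qr{
        (                   # group 1: one complete balanced group
            $opener
            (?: $body++     # a run of body characters, never given back
              | (?1)        # or a nested group: recurse into group 1
            )*
            $closer
        )
    }xms;

    my @groups;
    while ($s =~ m{ $regex }xmsg) {
        push @groups, $1;
    }
    return @groups;
}

my $s = 'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
say "'$_'" for balanced_groups($s, '{{', ']]');
```

The negative lookaheads test for the whole delimiter sequence at each position, so the ']' in '[OK]' is ordinary body text.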

Update: Oh, what the heck... Here's the 5.8.9 version:

    >perl -wMstrict -le
    "print qq{perl version $]};
     ;;
     my $opener = qr{ \{\{ }xms;
     my $closer = qr{ \]\] }xms;
     my $body = qr{ (?! $opener) (?! $closer) . }xms;
     ;;
     use re 'eval';
     our $regex = qr{ $opener (?: (?> $body+) | (??{ $regex }) )* $closer }xms;
     ;;
     my $s = 'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
     ;;
     print qq{'$1'} while $s =~ m{ ($regex) }xmsg;
    "
    perl version 5.008009
    '{{ foo {{ bar ]] baz [OK] ]]'
    '{{ fee ]]'
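The same 5.8-style approach as a commented script, for anyone who finds the one-liner hard to read. This is a sketch under a couple of assumptions: the sub name balanced_groups_58 is hypothetical, and $regex is declared in a separate statement so that the name is already in scope inside the code block on perls that compile the block along with the surrounding code.

```perl
use strict;
use warnings;
use re 'eval';    # allow code blocks in patterns built by interpolation

my $opener = qr{ \{\{ }xms;
my $closer = qr{ \]\] }xms;
my $body   = qr{ (?! $opener) (?! $closer) . }xms;

# A package variable, looked up each time the deferred block runs,
# by which point the assignment below has completed.
our $regex;
$regex = qr{
    $opener
    (?: (?> $body+ )        # atomic run of body characters
      | (??{ $regex })      # nested group: match the whole pattern again
    )*
    $closer
}xms;

# Hypothetical helper: returns every outermost balanced group.
sub balanced_groups_58 {
    my ($s) = @_;
    my @groups;
    push @groups, $1 while $s =~ m{ ($regex) }xmsg;
    return @groups;
}

my $s = 'xxx {{ foo {{ bar ]] baz [OK] ]] [NO] yyy {{ fee ]] zzz';
print "'$_'\n" for balanced_groups_58($s);
```

Here (?> ... ) stands in for the possessive quantifier and (??{ ... }) for (?1), since neither 5.10 construct is available in 5.8.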

Re^2: ERROR ... nested quantifiers in regex
by ŞuRvīvőr (Novice) on Jul 07, 2011 at 07:53 UTC

    Would you please provide some notations to the code you just posted? I'm still a newbie in regex, especially the freaky nested regex.

    Besides, do you have any simple reference that helps understanding regex with all its tricks?

      Would you please provide some notations to the code you just posted ...

      I don't have time at the moment, but will try to do so tomorrow.

      ... simple reference that helps understanding regex with all its tricks ...

      Believe me, brother, there ain't no such thing! If there's one thing regular expressions are not, it's simple. (They're also not regular.) The following perldoc and on-line documentation should be helpful: perlre, perlretut, perlreref, perlfaq6, perlrecharclass, perlrebackslash. Jeffrey Friedl's book is excellent – and priced accordingly: Mastering Regular Expressions. See also the Tutorials section of the Monastery.

      Would you please provide some notations to the code you just posted ...

      I have taken a look at the discussions of the (??{ code }) (5.8 and 5.10+) and (?PARNO) (5.10+ only) constructs in the Extended Patterns section of perlre, and I must say that while they are not ideal for a new learner of regex, they are better than anything I could provide.

      Let me suggest that you (re-)read these sections thoroughly, ponder the replies to your posts above, do some (better yet, lots of) experiments, and then come back to the Monastery and post any remaining or new questions you may have. If you do all this right, you should have lots of questions because regular expressions are very powerful and can be quite subtle. Don't expect to learn it all overnight, but the effort you invest will be well rewarded.

      BTW: If you have any more regex questions, be sure to mention the version of Perl you settle on using.