in reply to Re: Matching nested begin/ends
in thread Matching nested begin/ends

It is not.

We simply test differently; I tested with something like this (using your input):

my $re = qr/ begin (?: (?> [^be]* ) |(??{ $re }) | [be] )* end /x;
foreach (<DATA>) {
    chomp;
    my @matches = $_ =~ /($re)/g;
    print qq(For "$_":\n\t);
    print (@matches ? join("*",@matches) : "no matches", "\n");
}
__DATA__
begin end
begin en
begin nd
begin begin end end
beginend
beginbeginbeginendendend
begin begin end begin begin end begin end end end
begin begin end begin egin end begin end end end
begin end begin end

Which prints:

For "begin end":
	begin end
For "begin en":
	no matches
For "begin nd":
	no matches
For "begin begin end end":
	begin begin end end
For "beginend":
	beginend
For "beginbeginbeginendendend":
	beginbeginbeginendend
For "begin begin end begin begin end begin end end end":
	begin begin end begin begin end begin end end
For "begin begin end begin egin end begin end end end":
	begin begin end begin egin end begin end end
For "begin end begin end":
	begin end*begin end

Re: Matching nested begin/ends
by Abigail-II (Bishop) on Aug 02, 2002 at 09:30 UTC
    That just means your test isn't good enough. You are testing whether "begin end end begin" *contains* a matched begin/end pair. My test, however, anchors the regex to the beginning and end of the string, and hence correctly flags "begin end begin end" as *not* being a nested begin/end construct.

    It's the same way that /\d/ is *not* a correct regex to test whether a string is a number: it only tests whether a string contains a digit. But if all you want to know is whether a string contains a begin/end delimited substring, all you need is /begin.*end/. No recursion required.

    Abigail
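
The difference between "contains" and "is" in the post above can be illustrated with a minimal sketch. The helper names contains_digit and is_all_digits are hypothetical, chosen only for this example, and "number" is assumed here to mean a non-empty run of digits.

```perl
use strict;
use warnings;

# Containment test: true if a digit appears anywhere in the string.
sub contains_digit { return $_[0] =~ /\d/ ? 1 : 0 }

# Anchored test: true only if the whole string is one or more digits.
# \A and \z pin the match to the very start and very end of the string.
sub is_all_digits { return $_[0] =~ /\A\d+\z/ ? 1 : 0 }

for my $s ('42', 'abc42', '4 2') {
    printf "%-8s contains=%d whole=%d\n",
        qq("$s"), contains_digit($s), is_all_digits($s);
}
```

All three strings pass the containment test, but only "42" passes the anchored one. The same distinction applies to a begin/end pattern: without anchors it answers "is there a match somewhere in here?", not "is this string a well-formed construct?".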

      I again disagree. Delimited text is generally part of a larger document that needs to be processed. I can see your point that anchoring the regex is the best way to fully verify the text item's syntax. However, why would this be needed? The item must have followed some pattern to be extracted in the first place.

      Also, /begin.*end/ will allow a begin followed by an end, which will allow an item like beginbeginend to pass with no problems; hence recursion is needed.
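
To make the beginbeginend point concrete, here is a rough sketch that compares /begin.*end/ with an anchored, balanced pattern. It is not the regex from this thread: it swaps in the named recursion syntax (?&name) available from Perl 5.10, and it assumes every occurrence of "begin" or "end" in the text is a delimiter. The names $block, loose_ok and balanced_ok are made up for the illustration.

```perl
use strict;
use warnings;
use 5.010;    # named groups and (?&name) recursion need Perl 5.10 or later

# One balanced block: "begin", then any mix of ordinary characters and
# complete nested blocks, then "end". No word-boundary checks are done.
my $block = qr/
    (?<block>
        begin
        (?:
            (?! begin | end ) .    # a character that does not start a keyword
          |
            (?&block)              # or a complete nested block
        )*
        end
    )
/xs;

# Loose test: some "begin" is followed somewhere by some "end".
sub loose_ok    { return $_[0] =~ /begin.*end/ ? 1 : 0 }

# Strict test: the whole string is exactly one balanced block.
sub balanced_ok { return $_[0] =~ /\A$block\z/ ? 1 : 0 }

for my $s ('beginbeginend', 'begin begin end end', 'begin end begin end') {
    say qq("$s": loose=), loose_ok($s), ' balanced=', balanced_ok($s);
}
```

All three strings pass the loose test, while only "begin begin end end" passes the anchored balanced one. Note that the balanced pattern without the \A and \z anchors would still find "beginend" inside "beginbeginend", so the recursion alone does not reject the unbalanced string.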

        Also, /begin.*end/ will allow a begin followed by an end, which will allow an item like beginbeginend to pass with no problems;
        So does your regex:
        $re = qr/begin (?: (?>[^be]*) |(??{ $re }) | [be] )* end/x;
        "begin begin end" =~ /$re/ and print "<$&>\n";
        __END__
        <begin begin end>
        Abigail