in reply to Re: Re: parsing question
in thread Re: parsing question

What you say is not true in Perl from version 5.6 onwards. The RE engine does have the power to solve this problem.

Of course the "Regular" in RE has been a misnomer for ages, but that is another story.

Re: Re (tilly) 3: parsing question
by mstone (Deacon) on Jan 08, 2002 at 05:43 UTC


    > The RE engine does have the power to solve this problem.

    Hrm.. please demonstrate. I know you can build a lexer by looping over m/\G$regexp/g, but that doesn't give you the state storage necessary to balance parens. If you could whip up something to convert the second-level parens in:

    (((())()))(()(()()))

    to square brackets:

    ([(())()])([][()()])

    using only regexps.. no variables, no recursion (and for arbitrary strings of nested parens, of course).. I'd love to add the technique to my bag of tricks.

      You just changed the problem massively. I didn't promise that the RE engine magically has become able to handle arbitrary parsing problems. I never said that no variables were involved. I merely said that the RE engine can handle balanced delimiters.

      The two experimental features which are needed are (??{}) for delayed evaluation and (?>) for telling the engine not to backtrack. A sample script demonstrating the technique:

      #! /usr/bin/perl
      my $braces;
      $braces = qr/(\((?:(?>[^\(\)]+)|(??{$braces}))*\))/;
      while (<>) {
        if ($_ =~ $braces) {
          print "Matched '$1'\n\n";
        }
        else {
          print "No match\n\n";
        }
      }
      Run it and start typing in lines. Any line that contains a balanced set of parentheses will match, and the first such group is printed.
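
      If you would rather not type at a prompt, here is a rough sketch that feeds the same pattern a few fixed strings. The helper name first_balanced and the sample inputs are mine, purely for illustration. Note that the pattern is unanchored, so it finds the first balanced group anywhere in the string rather than checking that the whole string is balanced.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in the script above: an opening paren, then any mix of
# non-paren runs (atomic, so the engine never backtracks into them) or a
# nested balanced group (the pattern re-entered through (??{ })), then a
# closing paren.
my $braces;
$braces = qr/(\((?:(?>[^\(\)]+)|(??{$braces}))*\))/;

# Hypothetical helper (not from the original post): return the first
# balanced group found in $text, or undef if there is none.
sub first_balanced {
    my ($text) = @_;
    return $text =~ $braces ? $1 : undef;
}

# Illustrative inputs only.
for my $line ('f(a(b)c) + g(d)', 'no parens here', '((unclosed)', ')(') {
    my $found = first_balanced($line);
    print defined $found
        ? "Matched '$found' in '$line'\n"
        : "No match in '$line'\n";
}
```

      The third input is the interesting one: the outer paren never closes, so the engine gives up on it and matches the inner group instead.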
      One example of this sort of thing is here. Look up perlre, and search the text for "postponed".
         MeowChow                                   
                     s aamecha.s a..a\u$&owag.print

        Interesting.. thanks.