amw1 has asked for the wisdom of the Perl Monks concerning the following question:

Are there any modules that deal with building a prefix notation boolean logic tree? Maintaining the order is important. A LoL is preferred. I have a feeling that this is something that someone has already done somewhere. I have some thoughts on how to start on this but there may be cases I'm missing. The goal is not to represent any type of tree but one that a user would enter rule by rule. i.e. user enters the first clause. User ands a new clause to the end. User or's a new clause to the end of that. User and's a new clause to the end of that. Should build a tree using standard precedence rules.
I don't know if I'm explaining myself that well, it's late and I've been staring at other parts of this for too long. :) Anyway, here's an example of the type of list I'd like to create. I searched cpan for prefix notation logic trees, googled for the same. No real success. Anyone ever done something like this or have pointers to an algorithim to help construct this thing? I have some thoughts on how to start it but I figure the more input I get the less chance of making a mistake. 'sides some of you guys are hella smart. :)
# user adds first clause [==,a,b] # implies if a == b, prefix notation # and a new clause [and,[==,a,b],[==,c,d]] # or a new clause [or,[and,[==,a,b],[==,c,d]],[==,e,f]] # and a new clause [or,[and,[==,a,b],[==,c,d]],[and,[==,e,f],[==,g,h]] etc
This would be the representation of the above in infix notation
((a==b)and(c==d))or((e==f)and(g==h))

Replies are listed 'Best First'.
Re: prefix notation in an array
by kvale (Monsignor) on Mar 26, 2004 at 07:10 UTC
    Problems like this involve two parts: parsing and syntax-directed translation. So the first step is to create a grammar for your infix expression:
    # infix operators: ==, or, and # grouping: () # terminals : \w+ # # term := (expr) # | \w+ # expr := term /and/ term # | term /or/ term # | term /==/ term
    Parsing expressions usuing this grammar will generate a parse tree. Then you can walk the parse tree to create the prefix notation. In our case, the grammar is so simple, we can translate as we go along:
    # infix operators: ==, or, and # grouping: () # terminals : \w+ # # term := (expr) {print '[' . $expr . ']'} # | \w+ {print $word} # expr := term /and/ term {print 'and,' . $term1 . ',' . $term2} # | term /or/ term {print 'or,' . $term1 . ',' . $term2} # | term /==/ term {print '==,' . $term1 . ',' . $term2}
    The bits in {} are called actions and you can take them as soon as you satisfy the given parse rule.

    Given the grammar and actions, your task is (you didn't think I would take all the fun out of it, did you?) to create a recursive descent program implementing this grammar. The grammar is simple enough that you could hand-roll your own. Or you could use Parse::RecDescent to automatically generate a program from your grammar (you may have to tweak the grammar to satisfy P::RD's grammar format).

    Have fun!

    -Mark

Re: prefix notation in an array
by tzz (Monk) on Mar 26, 2004 at 14:28 UTC
    I've done something similar in my cfperl program, which parses cfengine rules that look like this (a AND b OR c):
    a.b|c

    I use Parse::RecDescent to turn text input directly into a parsed optree, which is then interpreted at runtime because parts of the tree change value depending on what's defined at the time. You can probably do the same with prefix instead of infix notation - the grammar changes are minor.

    HTH

    Ted

Re: prefix notation in an array
by tsee (Curate) on Mar 26, 2004 at 14:56 UTC

    Update: I am so sorry. I didn't read your post thoroughly enough. This is actually an infix-expression parser. I posted code that solves your problem in a reply to this post. -- Steffen

    The following code should solve your problem including operator precedence (and > or > == > !).

    Note, however, that it will be very slow when dealing with long strings.

    use strict; use warnings; use Data::Dumper; use Parse::RecDescent; my $grammar = <<'GRAMMAR'; boolean_expr: and_expr { $item[1] } | terminal { $item[1] } | <error> and_expr: <leftop: or_expr 'and' or_expr> { my $ary = $item[1]; my $return = $ary->[0]; foreach (@{$ary}[1..$#$ary]) { $return = ['and', $return, $_]; } $return; } or_expr: <leftop: eq_expr 'or' eq_expr> { my $ary = $item[1]; my $return = $ary->[0]; foreach (@{$ary}[1..$#$ary]) { $return = ['or', $return, $_]; } $return; } eq_expr: <leftop: unary_expr '==' unary_expr> { my $ary = $item[1]; my $return = $ary->[0]; foreach (@{$ary}[1..$#$ary]) { $return = ['==', $return, $_]; } $return; } unary_expr: '!' term { ['!', $item[2]] } | term { $item[1] } term: '(' boolean_expr ')' { $item[2] } | terminal { $item[1] } terminal: /\w+/ { $item[1] } GRAMMAR my $parser = Parse::RecDescent->new($grammar); while (<STDIN>) { chomp; my $parsed; $parsed = $parser->boolean_expr($_); next unless defined $parsed; print Dumper $parsed; }

    For a more involved example of a Parse::RecDescent grammar for parsing similar expressions, have a look at the Math::Symbolic::Parser module (algebraic expressions in that case).

    Steffen

      I am sorry, the code that actually solves the outlined problem is the following. You can enter prefix boolean expressions in brackets ( "[and, a, [==, b, [!, c]]]" ).
      After you entered the first valid expression, you may instead add new expressions as "</code>or, d" which will be applied to the previous expression. Entering "<code>!" negates the current expression.

      use strict; use warnings; use Data::Dumper; use Parse::RecDescent; my $grammar = <<'GRAMMAR'; boolean_expr: '[' /and|or|==/ ',' boolean_expr ',' boolean_expr ']' { [ $item[2], $item[4], $item[6] ] } | '[' '!' ',' boolean_expr ']' { [ '!', $item[4] ] } | /and|or|==/ ',' boolean_expr { [ $item[1], $item[3] ] } | '!' | terminal { $item[1] } | <error> terminal: /\w+/ { $item[1] } GRAMMAR my $parser = Parse::RecDescent->new($grammar); my $old; while (<STDIN>) { chomp; my $parsed = $parser->boolean_expr($_); next unless defined $parsed; if (ref $parsed eq 'ARRAY' and @$parsed == 2 and $parsed->[0] =~ /and|or|==/ ) { warn("Cannot add clause if no " . "starting term was defined."), next if not defined $old; $old = [$parsed->[0], $old, $parsed->[1]]; } elsif ($parsed eq '!') { warn("Cannot add clause if no " . "starting term was defined."), next if not defined $old; $old = ['!', $old]; } else { $old = $parsed; } print Dumper $old; }
      Update: Fixed line breaks for display.

      Steffen

      I played around with RecDescent a bit last night after posting. I think for the types of things I'm trying to express it may be more than I need (though it will definately do it) I have pretty strict control over how these things get added (adding clauses via a web form) so I don't need to deal with the vaugeries of human typed input. I came up with this which appears to be working for me. Just a fragment that's going to be inserted into a larger bit of code, no error handling/strict etc. Just a poc at this point.
      #!/usr/bin/perl + + use Data::Dumper; my $clauses; push(@$clauses, ["==","a","b"]); push(@$clauses, "and"); push(@$clauses, ["==","c","d"]); push(@$clauses, "or"); push(@$clauses, ["==","e","f"]); push(@$clauses, "and"); push(@$clauses, ["==","g","h"]); $Data::Dumper::Indent = 0; + + my $list = BuildList($clauses); print Dumper($list). "\n"; sub BuildList($) { my $clauses = shift(); my $list; while(my $clause = shift(@$clauses)) { if($clause eq "and") { $list = ['AND',[$list], [shift(@$clauses)]]; } elsif($clause eq "or") { $list = ["OR",[$list, BuildList($clauses)]]; } else { $list = $clause } } return($list); }
      This gives me
      ['OR',[['AND',[['==','a','b']],[['==','c','d']]],['AND',[['==','e','f' +]],[['==','g','h']]]]]
      Thanks for all the input.
      Update: It's also important to note, that I don't really care about what's in each clause. The clauses are going to be pretty tightly controlled by the creation mechanisim.