Here's the work I've done so far. It's now tested (see the update below). I'll also post a version that doesn't use Parse::RecDescent.

My code should output a structure of the form

# ( a OR b ) AND ( c ) NOT ( d ) AND ( e OR f )
[
   [ 'a', 'b' ],
   [ AND => 'c' ],
   [ NOT => 'd' ],
   [ AND => 'e', 'f' ],
]
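To show how that structure might be consumed, here is a minimal sketch that checks a set of words against a parsed query. The `matches` helper is hypothetical and not part of my parser, and it assumes the groups are applied left to right with NOT meaning "and not"; the thread doesn't pin those semantics down.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not part of the parser itself):
# returns 1 if the words that are keys of %$have satisfy the query in $tree.
# Assumes left-to-right evaluation, with NOT meaning "AND NOT".
sub matches {
    my ($tree, $have) = @_;

    my ($first, @rest) = @$tree;

    # The first group carries no operator; it matches if any term is present.
    my $result = grep { $have->{$_} } @$first;

    for my $group (@rest) {
        my ($op, @terms) = @$group;

        # Terms inside a group are OR'd together.
        my $any = grep { $have->{$_} } @terms;

        if    ($op eq 'AND') { $result &&= $any; }
        elsif ($op eq 'NOT') { $result &&= !$any; }
        else                 { die("Unknown operator '$op'\n"); }
    }

    return $result ? 1 : 0;
}

my $tree = [
    [ 'a', 'b' ],
    [ AND => 'c' ],
    [ NOT => 'd' ],
    [ AND => 'e', 'f' ],
];

# b, c and f are present and d is absent, so every group is satisfied.
print(matches($tree, { b => 1, c => 1, f => 1 }) ? "match\n" : "no match\n");
```

Keeping the operator as the first element of each group means a consumer only needs one loop and no recursion.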

Parser creator:

# make_parser.pl

use strict;
use warnings;

use Parse::RecDescent ();

my $grammar = <<'__END_OF_GRAMMAR__';

   {
      use strict;
      use warnings;
   }

   # The whole input must be a single expression.
   parse  : expr /^\Z/ { $item[1] }

   # Flatten the tail so the result is one flat list of groups.
   expr   : terms expr_ { [ $item[1], @{$item[2]} ] }

   # Each following group is prefixed with its operator.
   expr_  : AND terms expr_ { [ [ $item[1], @{$item[2]} ], @{$item[3]} ] }
          | NOT terms expr_ { [ [ $item[1], @{$item[2]} ], @{$item[3]} ] }
          | { [] }

   terms  : '(' terms_ ')' { $item[2] }

   terms_ : IDENT OR terms_ { [ $item[1], @{$item[3]} ] }
          | IDENT { [ $item[1] ] }

   IDENT  : /\S+/

   # Keywords are identifiers with a specific spelling.
   AND    : IDENT { $item[1] eq 'AND' ?1:undef } { $item[1] }
   NOT    : IDENT { $item[1] eq 'NOT' ?1:undef } { $item[1] }
   OR     : IDENT { $item[1] eq 'OR'  ?1:undef } { $item[1] }

__END_OF_GRAMMAR__

# Writes QueryParser.pm to the current directory.
Parse::RecDescent->Precompile($grammar, "QueryParser")
   or die("Bad grammar\n");

Sample application:

# query_parser.pl

use strict;
use warnings;

use Data::Dumper qw( );
use QueryParser qw( );

sub display {
   my ($expr, $tree) = @_;
   local $Data::Dumper::Indent = 0;
   print(Data::Dumper->Dump([$tree], [$expr]), "\n");
}

{
   my $parser = QueryParser->new();

   foreach my $expr (
      '( keyword1 )',
      '( keyword1 ) AND ( keyword2 )',
      '( keyword1 ) AND ( keyword2 ) AND ( keyword3 )',
      '( keyword1 ) NOT ( keyword2 )',
      '( keyword1 ) NOT ( keyword2 ) NOT ( keyword3 )',
      '( keyword1 ) AND ( keyword2 ) NOT ( keyword3 )',
      '( keyword1 ) NOT ( keyword2 ) AND ( keyword3 )',
      '( keyword1 OR keyword2 )',
      '( keyword1 ) AND ( keyword2 OR keyword3 )',
      '( keyword1 ) NOT ( keyword2 OR keyword3 )',
   ) {
      my $tree = $parser->parse($expr);
      display($expr, $tree);
   }
}

It could use some better error reporting. I'll probably only do that in the version that doesn't use Parse::RecDescent (since PRD is so slow).
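To give an idea of what error reporting could look like without Parse::RecDescent, here is a rough hand-written sketch of the same grammar. It is not the version I said I'd post: `parse_query` and the wording of its messages are made up for illustration. Unlike the grammar above, it also refuses to treat `(`, `)`, `AND`, `OR` and `NOT` as keywords, which is an assumption about what's wanted.

```perl
use strict;
use warnings;

# Hypothetical hand-written equivalent of the grammar (illustrative only).
# Returns the same structure as the generated parser; dies with a message
# naming the offending token on a syntax error.
sub parse_query {
    my ($text) = @_;

    # Same tokenization as IDENT : /\S+/, so tokens are whitespace-separated.
    my @tokens = split(' ', $text);
    my $pos = 0;

    my $fail = sub {
        my ($expected) = @_;
        my $found = $pos < @tokens ? "'$tokens[$pos]'" : 'end of input';
        die("Expected $expected but found $found at token " . ($pos + 1) . "\n");
    };

    # terms : '(' IDENT ( OR IDENT )* ')'
    my $parse_terms = sub {
        $fail->("'('") unless $pos < @tokens && $tokens[$pos] eq '(';
        ++$pos;

        my @terms;
        while (1) {
            # Assumption: reserved words and parens are not valid keywords.
            $fail->('a keyword')
                if $pos >= @tokens
                || $tokens[$pos] =~ /^(?:\(|\)|AND|OR|NOT)\z/;
            push @terms, $tokens[$pos++];

            last unless $pos < @tokens && $tokens[$pos] eq 'OR';
            ++$pos;
        }

        $fail->("'OR' or ')'") unless $pos < @tokens && $tokens[$pos] eq ')';
        ++$pos;

        return \@terms;
    };

    # expr : terms ( ( AND | NOT ) terms )*
    my @tree = ( $parse_terms->() );
    while ($pos < @tokens) {
        my $op = $tokens[$pos];
        $fail->("'AND' or 'NOT'") unless $op eq 'AND' || $op eq 'NOT';
        ++$pos;
        push @tree, [ $op, @{ $parse_terms->() } ];
    }

    return \@tree;
}
```

Calling `parse_query('( a OR )')` dies with a message pointing at the fourth token, whereas the generated parser just returns undef on a failed parse.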

It could be optimized a bit.

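I'm kinda curious about the parens. They are superfluous: the grammar has no nesting, so AND and NOT already mark where each group of OR'd keywords starts and ends.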

Update: Tested. I made a couple of small fixes.


In reply to Re: Logical expressions by ikegami
in thread Logical expressions by rsiedl
