LL grammars such as the one you have listed convert very mechanically into "recursive descent" parsers.

You might want to check out Parse::RecDescent. You should be able to go straight from your grammar to a working parser.

Otherwise, write yourself a next_token() and an eat_token() function and, to make things easier, a save_position() and restore_position() pair so that you can easily put already-eaten tokens back onto the front of the stream. (Backtracking like this is not necessary with some grammars; see books on recursive descent and LL grammars for more.) Then write code like this:

    sub expect_Elements {
        my $stream = shift;
        my ($expression, $rel);

        # Remember where we started so we can back out on failure.
        my $pointer = save_position($stream);

        if (next_token($stream) eq "(") {
            # '(' CCL_Find ')'
            eat_token($stream);
            $expression = [ expect_CCL_Find($stream) ];
            # It might make more sense to die() here
            if (eat_token($stream) ne ")") {
                $expression = undef;
            }
        }
        else {
            if ($expression = expect_Set($stream)) {
                # matched a Set; nothing more to do
            }
            elsif ($expression = expect_Terms($stream)) {
                # matched Terms; nothing more to do
            }
            elsif ($expression = expect_Qualifiers($stream)) {
                if ($rel = expect_Relation($stream)) {
                    if (next_token($stream) eq "(") {
                        # Qualifiers Relation '(' CCL_Find ')'
                        # (as written, this drops $rel and the qualifiers)
                        eat_token($stream);
                        $expression = [ expect_CCL_Find($stream) ];
                        (eat_token($stream) eq ")") or $expression = undef;
                    }
                    elsif (my $terms = expect_Terms($stream)) {
                        # Qualifiers Relation Terms
                        $expression = [ $rel, $expression, $terms ];
                    }
                    else {
                        $expression = undef;
                    }
                }
                elsif (next_token($stream) eq "=") {
                    # Qualifiers '=' string '-' string
                    eat_token($stream);
                    my $string1 = expect_string($stream);
                    my $op      = eat_token($stream);    # consume the "-"
                    my $string2 = expect_string($stream);
                    if (!defined $string1 or !defined $string2
                        or !defined $op or $op ne "-") {
                        $expression = undef;
                    }
                    else {
                        $expression = [ "=", $expression, $string1, "-", $string2 ];
                    }
                }
                else {
                    $expression = undef;
                }
            }
            else {
                $expression = undef;
            }
        }

        if (defined $expression) {
            return $expression;
        }
        else {
            # Put back everything we ate and report failure.
            restore_position($stream, $pointer);
            return undef;
        }
    }
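For completeness, here is a rough sketch of what the four helper functions used above might look like. Everything about the stream layout is my own assumption: a hash ref with a `tokens` array and a `pos` cursor, built by a hypothetical make_stream(). I have also assumed that running off the end of the input yields an empty string rather than undef, so that comparisons with `eq` do not warn; that only works if your tokenizer never produces empty tokens.

```perl
use strict;
use warnings;

# Hypothetical stream layout: a hash ref with a token list and a cursor.
sub make_stream {
    my @tokens = @_;
    return { tokens => [@tokens], pos => 0 };
}

# Peek at the next token without consuming it.
# Assumption: returns "" at end of input.
sub next_token {
    my $stream = shift;
    my $token  = $stream->{tokens}[ $stream->{pos} ];
    return defined $token ? $token : "";
}

# Consume the next token and return it ("" at end of input).
sub eat_token {
    my $stream = shift;
    my $token  = next_token($stream);
    $stream->{pos}++ if $stream->{pos} < @{ $stream->{tokens} };
    return $token;
}

# The saved "position" is just the cursor value ...
sub save_position {
    my $stream = shift;
    return $stream->{pos};
}

# ... so un-eating tokens is a matter of resetting the cursor.
sub restore_position {
    my ($stream, $position) = @_;
    $stream->{pos} = $position;
    return;
}

# Usage: eat two tokens, then back out as a failed rule would.
my $stream = make_stream("(", "a", "b", ")");
my $mark   = save_position($stream);
eat_token($stream);
eat_token($stream);
restore_position($stream, $mark);
print next_token($stream), "\n";
```

Keeping the whole token list in memory makes backtracking trivial. If you read tokens lazily from a file handle you would need a pushback buffer instead.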

I think it should be fairly obvious why you'd want a module (or flex/yacc) to write this for you for larger grammars, but every programmer should write at least one recursive descent parser :-).


In reply to Re: Parsing CCL (Common Command Language) commands by mugwumpjism
in thread Parsing CCL (Common Command Language) commands by e_bachmann
