I'm finally spending time on a Parse::RecDescent grammar, but I'm having trouble with low-level productions that are very similar: the wrong one is often picked, which causes the parse to fail. The module's docs suggest that the <score: ...> directive might be useful, but I'm not clear on how to take advantage of it.

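The issue seems to be that the attribute statements I'm trying to parse take one of three forms: a scalar assignment (CounterInterval = 5), a key list (Administrators = { vcs }), or an association (UserNames = { vcs = X1Nh6WIWs6ATQ }).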

Here's example code I've got, with __DATA__ at the end:

use strict;
use warnings;
use Parse::RecDescent;
use Data::Dumper;

my $grammar = <<'EOG';
<autotree>

VCSConfig: statement(s)

statement: clause
         | def

clause: "include" pathname    # Include Clause

# NOTE: May not have any attributes...
def: "cluster" name "(" Attr(s?) ")"
   | "system"  name "(" Attr(s?) ")"

# Pathname may or may not be surrounded by double quotes
pathname: dquote(?) /([^"]+)/ dquote(?)
          { $return = $1; }

dquote: /"/

name: /\w+/

Attr: AttrScalar(s?)
    | AttrKeyList(s?)
    | AttrAssociation(s?)

AttrScalar:      attribute '=' string
AttrKeyList:     attribute '=' keylist
AttrAssociation: attribute '=' association

attribute: /[a-zA-Z][\w@]+/    # allow '@' in attr name

# NOTE: separator can be either of ',' or ';'
keylist:     '{' <leftop: string /[,;]/ string> '}'
association: '{' <leftop: key_value /[,;]/ key_value> '}'

key_value: string '=' string

string: /[a-zA-Z]\w+/
EOG

my ($vcs_parse)   = Parse::RecDescent->new( $grammar );

# Slurp the whole __DATA__ section into one string
my ($vcs_config)  = do { local $/; <DATA>; };
my ($orig_config) = $vcs_parse->VCSConfig( $vcs_config );

print Dumper $orig_config;

__DATA__
include "types.cf"
include "LBSybase.cf"
include "OracleTypes.cf"

cluster vcs (
    UserNames = { vcs = X1Nh6WIWs6ATQ }
    Administrators = { vcs }
    CounterInterval = 5
    )

system njengsunvcs1 (
    )

system njengsunvcs2 (
    )

The include clauses are parsed with no problem, but as soon as I hit the cluster clause, everything starts to break down, because I can't figure out how to get the grammar to properly differentiate between the Association, KeyList, and Scalar assignments within that clause.

Would the <score: ...> directive help me here? Or is there a much simpler way to get the grammar in line?

Note that this is just a small snippet of the config file in my example - the actual file I'm trying to parse is hundreds of lines long and has several other clause types, but they all have the same attribute types I'm trying to parse here - so this isn't really an easy regex problem either.
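One possible simpler route that does not involve <score: ...> is sketched here, not tested against the full file. Two things in the grammar above stand out. Every alternative of Attr is a (s?) repetition, so the first one, AttrScalar(s?), can succeed by matching nothing, and the other two alternatives are never tried. Also, string requires a leading letter, so it cannot match the 5 in CounterInterval = 5. The sketch factors out the shared attribute '=' prefix and lets the value decide the form, trying the most specific alternative first. The rule names Attrs, value and token are hypothetical, explicit actions replace <autotree> to keep the result small, and allowing a value to start with a digit is an assumption about the real config format.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical rule names: Attrs, value, token. The rules attribute, keylist,
# association and key_value are adapted from the grammar in the post.
my $grammar = <<'EOG';
Attrs: Attr(s?) /\z/
       { $return = $item[1] }

# The shared "attribute =" prefix is matched once; the value decides the form.
Attr: attribute '=' value
      { $return = [ $item[1], @{ $item[3] } ] }

# Productions are tried in order, so the most specific form comes first.
value: association  { $return = [ 'association', $item[1] ] }
     | keylist      { $return = [ 'keylist',     $item[1] ] }
     | token        { $return = [ 'scalar',      $item[1] ] }

association: '{' <leftop: key_value /[,;]/ key_value> '}'
             { $return = $item[2] }
keylist:     '{' <leftop: token /[,;]/ token> '}'
             { $return = $item[2] }
key_value:   token '=' token
             { $return = [ $item[1], $item[3] ] }

attribute: /[a-zA-Z][\w@]+/
token:     /\w+/    # assumption: values may start with a digit, e.g. 5
EOG

my $parser = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

# The three attribute lines from the cluster clause in the post
my $text = <<'EOT';
UserNames = { vcs = X1Nh6WIWs6ATQ }
Administrators = { vcs }
CounterInterval = 5
EOT

my $attrs = $parser->Attrs($text)
    or die "Parse failed\n";

# Each entry is [ attribute name, form, payload ]
printf "%-16s %s\n", $_->[0], $_->[1] for @$attrs;
```

The ordering matters because an association and a key list both start with '{': the association alternative fails as soon as a key is not followed by '=', and the parser then falls back to keylist. Whether <score: ...> would be a better fit for the full file is left open.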


In reply to Parse::RecDescent Grammar Questions by gmarler
