in reply to Help with Parse::RecDescent grammar

I think there are two problems. First, the grammar does not allow multiple quoted strings in its list rule. Try this:
list: "(" QUOTED_STRING(s) ")"
Note the (s).
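A minimal sketch of what the (s) buys you, assuming Parse::RecDescent is installed. The one-line QUOTED_STRING regex here is a simplified stand-in, not the rule from your grammar:

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Simplified stand-in grammar: only the list rule matters here.
my $grammar = <<'_EOGRAMMAR_';
list          : "(" QUOTED_STRING(s) ")"   { $item[2] }
QUOTED_STRING : /"[^"]*"/
_EOGRAMMAR_

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";

# (s) means "one or more", and the matches come back as an array reference.
my $input  = '("gnd!" "Inout" "Ground")';
my $result = $parser->list($input);

print "matched ", scalar(@$result), " strings\n";
```

Without the (s), the rule accepts exactly one quoted string between the parentheses, so the same input fails.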

Second, the test input you are feeding it does not parse, since the trailing #f is not quoted and the grammar asks for a trailing quoted string. If you quote it (as in "#f"), it will parse. Or you could write a new rule for the trailer.

Phil

Re^2: Help with Parse::RecDescent grammar
by dramguy (Novice) on Dec 12, 2006 at 16:27 UTC
    Hi Phil, Thanks for the quick response.

    I updated my code with your suggestions and it will now parse.

    However, the data tree which is built isn't clear to me at all.

    Is there a better way to store the parsed data so it is more meaningful?

    $VAR1 = [
      bless( {
        '__RULE__' => 'portDefinition',
        'portLists' => bless( {
          '__RULE__' => 'portLists',
          'list(s?)' => [],
          '__STRING1__' => '\''
        }, 'portLists' ),
        'QUOTED_STRING' => ' quoted_char=HASH(0x3a0c54)',
        '__STRING1__' => 'dbSetCellPortTypes'
      }, 'portDefinition' )
    ];

    Thanks,

    Frank

      Start by not using <autotree>. You need to override it 90% of the time (100% in this case), so you might as well make everything explicit.

      use strict;
      use warnings;

      use Data::Dumper      qw( Dumper );
      use Parse::RecDescent qw( );

      $::RD_ERRORS = 1;
      $::RD_WARN   = 1;
      $::RD_HINT   = 1;
      #$::RD_TRACE  = 1;

      my $grammar = <<'_EOGRAMMAR_';

         {
            # These apply to code in this block and all actions.
            use strict;
            use warnings;

            my %escapes = (
               n => "\n",
            );

            sub dequote_double {
               for (my $s = @_ ? $_[0] : $_) {
                  s/^"//g;
                  s/"$//g;
                  s/\\(.)/$escapes{$1} || $1/eg;
                  return $_;
               }
            }
         }

         parse    : <skip:'(?:\s+|#[^\n]*\n)*'>
                    portDef(s?)
                    /\Z/
                    { $item[2] }

         portDef  : "dbSetCellPortTypes"
                    QSTRING
                    QSTRING
                    portList
                    { [ @item[1..4] ] }

         portList : "'" "(" record(s?) ")"
                    { $item[3] }

         record   : "(" QSTRING(s?) ")"
                    { $item[2] }

         QSTRING  : /"(?:[^"\\]|\\.)*"/
                    { dequote_double($item[1]) }

      _EOGRAMMAR_

      my $parser = Parse::RecDescent->new($grammar)
         or die "Bad grammar\n";

      my $text = <<'_EOT_';
      dbSetCellPortTypes "/opt/mylib/s956M" "*" '(
         ("gnd!" "Inout" "Ground" )
         ("vint!" "Inout" "Power" )
      ) #f
      _EOT_

      my $net = $parser->parse($text)
         or die "bad netlist";

      print Dumper $net;
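To see what the dequoting step does on its own, here is a small standalone sketch. It restates dequote_double with an explicit copy of its argument instead of the for-aliasing trick, so take it as an illustration of the same logic rather than the exact helper from the grammar:

```perl
use strict;
use warnings;

my %escapes = ( n => "\n" );

# Strip the surrounding double quotes, then replace each backslash escape
# with the character it stands for (unknown escapes keep the escaped char).
sub dequote_double {
    my ($s) = @_;
    $s =~ s/^"//;
    $s =~ s/"$//;
    $s =~ s/\\(.)/$escapes{$1} || $1/eg;
    return $s;
}

print dequote_double('"gnd!"'), "\n";         # gnd!
print dequote_double('"say \"hi\""'), "\n";   # say "hi"
```

Doing this in the QSTRING action means every rule above it already sees clean strings, so no leading-space flag or later cleanup pass is needed.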

      Other fixes:

      • I shortened some rule names. They made it hard to line up the productions.
      • It's important to check for end of input (/\Z/).
      • Your grammar didn't use strict or warnings, so I added them.
      • Properly handled comments using <skip>.
      • Changed the format. The one I used makes it easier to locate rules, and makes rule: this | that type rules easy to read. (Just line up the | with the :.)
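The <skip> and /\Z/ points can be checked with plain regexes, without the parser. This is only a sketch: as far as I understand Parse::RecDescent, the skip pattern is tried before each terminal, and the $tail string below is a hypothetical stand-in for whatever is left of the input after the final ")".

```perl
use strict;
use warnings;

# The skip pattern from the grammar: whitespace, or a '#' comment running
# to the end of the line, repeated any number of times.
my $skip = qr/(?:\s+|#[^\n]*\n)*/;

# Hypothetical remainder of the input after the last ")".
my $tail = " #f\n";

# Skip first, then require end of input, which is what /\Z/ checks.
my $at_end = $tail =~ /\A$skip\Z/ ? 1 : 0;
print "comment-only tail reaches end of input: $at_end\n";

# A tail that still holds a real token is not swallowed by the skip.
my $leftover = qq{ "extra"\n} =~ /\A$skip\Z/ ? 1 : 0;
print "tail with a leftover token reaches end of input: $leftover\n";
```

So the #f is never matched by /\Z/ itself. It is consumed as a comment by the skip pattern, and /\Z/ then succeeds because nothing else remains.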

      Update: Fixed portList. (Replaced { $item[2] } with { $item[3] }.)
      Update: Fixed comment and missing (s?) in parse.

        Hello, Thanks for the reply.

        Does the /\Z/ match the #f at the end of the DATA?

        Here is the output I get:

        $VAR1 = [ 'dbSetCellPortTypes', '/opt/mylib/s956M', '*', '(' ];

        Looks like the portLists are not being returned correctly...could this have something to do with the #f? Thanks,

        Frank

      Well, autotree is easy in the grammar, but harder later. The key is to use your own actions. You started that with the action for QUOTED_STRING, but you need more. I got this:
      $VAR1 = [
        {
          'trailer' => ' #f',
          'str2' => ' *',
          'portLists' => undef,
          'str1' => ' /opt/mylib/s956M'
        }
      ];
      from adding these actions:
      my $grammar = <<'_EOGRAMMAR_';
      QUOTED_STRING : /"/ <skip:""> quoted_char(s?) /"/
          { " " . join "", @{$item[3]}   # leading space flags a string
          }

      quoted_char : /[^\\"]+/ { $item[1] }
                  | /\\n/     { "\n" } { $item[1] }
                  | /\\"/     { "\"" } { $item[1] }

      portDefinition: "dbSetCellPortTypes" QUOTED_STRING QUOTED_STRING
                      portLists QUOTED_STRING
          { return {
                str1      => $item[2],
                str2      => $item[3],
                portLists => $item[4],
                trailer   => $item[5]
            }
          }

      portLists: "'" "(" list(s?) ")"
          { $item[2] }

      list: "(" QUOTED_STRING(s) ")"
          { $item[2] }
      _EOGRAMMAR_
      I'd have better names for the string keys, if I knew what they were for.

      Phil