dramguy has asked for the wisdom of the Perl Monks concerning the following question:

Hi, please forgive me, as I am rather new to the Parse::RecDescent module. I am working on a simple parser that would be rather easy to code in stand-alone Perl, but I am trying to learn this powerful module. I am having trouble with my grammar and have looked at all the examples for help, but to no avail. I basically want to be able to parse a structure similar to this:

dbSetCellPortTypes "/opt/mylib/95nm6M" "*" '( ("gnd!" "Inout" "Ground" ) ("vint!" "Inout" "Power" ) ) #f

Here is what I have so far:

use strict;
use Parse::RecDescent;
use Data::Dumper;

$::RD_ERRORS = 1;
$::RD_WARN   = 1;
$::RD_HINT   = 1;
$::RD_TRACE  = 1;

my $grammar = <<'_EOGRAMMAR_';

<autotree>

QUOTED_STRING : /"/ <skip:""> quoted_char(s?) /"/
    { " " . join "", @{$item[3]}  # leading space flags a string
    }

quoted_char : /[^\\"]+/
            | /\\n/  { "\n" }
            | /\\"/  { "\"" }

portDefinition: "dbSetCellPortTypes" QUOTED_STRING QUOTED_STRING portLists QUOTED_STRING

portLists: "'" "(" list(s?) ")"

list: "(" QUOTED_STRING ")"

_EOGRAMMAR_

my $parse = Parse::RecDescent->new($grammar) or die "bad grammar";

undef $/;
my $text = <<EOT;
dbSetCellPortTypes "/opt/mylib/s956M" "*" '( ("gnd!" "Inout" "Ground" ) ("vint!" "Inout" "Power" ) ) #f
EOT

my $net = $parse->portDefinition($text) or die "bad netlist";
print Dumper[$net];

I am getting a bad netlist error and don't know how to debug from here. Any help is appreciated. Thanks, Frank

Re: Help with Parse::RecDescent grammar
by philcrow (Priest) on Dec 12, 2006 at 16:17 UTC
    I think there are two problems. First, the grammar does not allow multiple quoted strings in its list rule. Try this:
    list: "(" QUOTED_STRING(s) ")"
    Note the (s).

    Second, the test input you are feeding it does not parse, since the trailing #f is not quoted and your grammar asks for a trailing quoted string. If you quote it (as in "#f"), it will parse. Or you could come up with a new rule for the trailer.
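
    Both changes can be sketched together as follows. This is only an illustration: the trailer rule and its name are assumptions (it accepts either a quoted string or the bare #f from the sample input), and QUOTED_STRING is simplified to a single regex that ignores backslash escapes.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Sketch only. The `trailer` rule is an assumption: it accepts either a
# quoted string or the bare #f seen in the sample input.
my $grammar = <<'END_GRAMMAR';
portDefinition : "dbSetCellPortTypes" QUOTED_STRING QUOTED_STRING portLists trailer
portLists      : "'" "(" list(s?) ")"
list           : "(" QUOTED_STRING(s) ")"
trailer        : QUOTED_STRING | "#f"
QUOTED_STRING  : /"[^"]*"/
END_GRAMMAR

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";

my $text = q{dbSetCellPortTypes "/opt/mylib/s956M" "*" '( ("gnd!" "Inout" "Ground" ) ("vint!" "Inout" "Power" ) ) #f};

# With no actions, a rule returns the value of its last matched item,
# so a defined result simply means the whole rule matched.
my $result = $parser->portDefinition($text);
print defined $result ? "parsed\n" : "bad netlist\n";
```

    The (s) on QUOTED_STRING lets each inner list hold one or more strings, and the extra alternative in trailer lets the unquoted #f through without changing the input.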

    Phil

      Hi Phil, Thanks for the quick response.

      I updated my code with your suggestions and it will now parse.

      However, the data tree which is built isn't clear to me at all.

      Is there a better way to store the parsed data so it is more meaningful?

      $VAR1 = [
                bless( {
                         '__RULE__' => 'portDefinition',
                         'portLists' => bless( {
                                                 '__RULE__' => 'portLists',
                                                 'list(s?)' => [],
                                                 '__STRING1__' => '\''
                                               }, 'portLists' ),
                         'QUOTED_STRING' => ' quoted_char=HASH(0x3a0c54)',
                         '__STRING1__' => 'dbSetCellPortTypes'
                       }, 'portDefinition' )
              ];

      Thanks,

      Frank

        Start by not using <autotree>. You need to override it 90% of the time (100% in this case), so you might as well make everything explicit.

        use strict;
        use warnings;

        use Data::Dumper      qw( Dumper );
        use Parse::RecDescent qw( );

        $::RD_ERRORS = 1;
        $::RD_WARN   = 1;
        $::RD_HINT   = 1;
        #$::RD_TRACE  = 1;

        my $grammar = <<'_EOGRAMMAR_';

           {
              # These apply to code in this block and all actions.
              use strict;
              use warnings;

              my %escapes = (
                 n => "\n",
              );

              sub dequote_double {
                 for (my $s = @_ ? $_[0] : $_) {
                    s/^"//g;
                    s/"$//g;
                    s/\\(.)/$escapes{$1} || $1/eg;
                    return $_;
                 }
              }
           }

           parse    : <skip:'(?:\s+|#[^\n]*\n)*'>
                      portDef(s?) /\Z/
                      { $item[2] }

           portDef  : "dbSetCellPortTypes"
                      QSTRING
                      QSTRING
                      portList
                      { [ @item[1..4] ] }

           portList : "'" "(" record(s?) ")"
                      { $item[3] }

           record   : "(" QSTRING(s?) ")"
                      { $item[2] }

           QSTRING  : /"(?:[^"\\]|\\.)*"/
                      { dequote_double($item[1]) }

        _EOGRAMMAR_

        my $parser = Parse::RecDescent->new($grammar)
           or die "Bad grammar\n";

        my $text = <<'_EOT_';
        dbSetCellPortTypes "/opt/mylib/s956M" "*" '( ("gnd!" "Inout" "Ground" ) ("vint!" "Inout" "Power" ) ) #f
        _EOT_

        my $net = $parser->parse($text)
           or die "bad netlist";

        print Dumper $net;

        Other fixes:

        • I shortened some rule names. They made it hard to line up the productions.
        • It's important to check for end of input (/\Z/).
        • Your grammar didn't use strict or warnings, so I added them.
        • Properly handled comments using <skip>.
        • Changed the format. The one I used makes it easier to locate rules, and makes rule: this | that type rules easy to read. (Just line up the | with the :.)
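
        To see why the end-of-input check matters, here is a small sketch with a hypothetical toy grammar (the rule names loose_list, checked_list and word are made up for illustration). Both start rules accept a run of "a" tokens, but only the one ending in /\Z/ refuses input with leftovers.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical toy grammar, not part of the original thread.
# loose_list stops at the first thing it cannot match and still succeeds;
# checked_list additionally requires that nothing but skipped whitespace remains.
my $parser = Parse::RecDescent->new(<<'END_GRAMMAR') or die "Bad grammar\n";
loose_list   : word(s?)       { $item[1] }
checked_list : word(s?) /\Z/  { $item[1] }
word         : "a"
END_GRAMMAR

my $input = "a a oops";

my $loose   = $parser->loose_list($input);    # succeeds, silently ignoring "oops"
my $checked = $parser->checked_list($input);  # fails, because "oops" is left over

print "loose_list matched ", scalar(@$loose), " words\n";
print defined $checked ? "checked_list accepted\n" : "checked_list rejected\n";
```

        Without the /\Z/ check, a typo halfway through a netlist would quietly truncate the result instead of producing a parse failure.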

        Update: Fixed portList. (Replaced { $item[2] } with { $item[3] }.)
        Update: Fixed comment and missing (s?) in parse.

        Well, autotree is easy in the grammar, but harder later. The key is to use your own actions. You started that with the action for QUOTED_STRING, but you need more. I got this:
        $VAR1 = [
                  {
                    'trailer' => ' #f',
                    'str2' => ' *',
                    'portLists' => undef,
                    'str1' => ' /opt/mylib/s956M'
                  }
                ];
        from adding these actions:
        my $grammar = <<'_EOGRAMMAR_';

        QUOTED_STRING : /"/ <skip:""> quoted_char(s?) /"/
            { " " . join "", @{$item[3]}  # leading space flags a string
            }

        quoted_char : /[^\\"]+/  { $item[1] }
                    | /\\n/  { "\n" }  { $item[1] }
                    | /\\"/  { "\"" }  { $item[1] }

        portDefinition: "dbSetCellPortTypes" QUOTED_STRING QUOTED_STRING portLists QUOTED_STRING
            { return { str1      => $item[2],
                       str2      => $item[3],
                       portLists => $item[4],
                       trailer   => $item[5]
                     }
            }

        portLists: "'" "(" list(s?) ")"  { $item[2] }

        list: "(" QUOTED_STRING(s) ")"  { $item[2] }

        _EOGRAMMAR_
        I'd have better names for the string keys, if I knew what they were for.
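
        For the nested lists, the index used in each action decides what comes back. The sketch below is an assumption-laden illustration, not the grammar above: QUOTED_STRING is simplified to one regex without escape handling, and only the portLists part of the input is parsed. In portLists the items are numbered "'" = 1, "(" = 2, list(s?) = 3, so the action returns $item[3].

```perl
use strict;
use warnings;
use Data::Dumper;
use Parse::RecDescent;

# Sketch only: simplified QUOTED_STRING, and portLists used as the start rule.
my $parser = Parse::RecDescent->new(<<'END_GRAMMAR') or die "Bad grammar\n";
portLists     : "'" "(" list(s?) ")"       { $item[3] }
list          : "(" QUOTED_STRING(s) ")"   { $item[2] }
QUOTED_STRING : /"[^"]*"/                  { substr($item[1], 1, -1) }
END_GRAMMAR

# Each inner list becomes an array ref of unquoted strings.
my $lists = $parser->portLists(q{'( ("gnd!" "Inout" "Ground") ("vint!" "Inout" "Power") )});
print Dumper($lists);
```

        The result is an array of arrays, one inner array per port, which is usually easier to walk than the blessed hashes <autotree> produces.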

        Phil