in thread Help with Parse::RecDescent grammar

Hi Phil, Thanks for the quick response.

I updated my code with your suggestions and it will now parse.

However, the data tree which is built isn't clear to me at all.

Is there a better way to store the parsed data so it is more meaningful?

$VAR1 = [
  bless( {
    '__RULE__' => 'portDefinition',
    'portLists' => bless( {
      '__RULE__' => 'portLists',
      'list(s?)' => [],
      '__STRING1__' => '\''
    }, 'portLists' ),
    'QUOTED_STRING' => ' quoted_char=HASH(0x3a0c54)',
    '__STRING1__' => 'dbSetCellPortTypes'
  }, 'portDefinition' )
];

Thanks,

Frank

Re^3: Help with Parse::RecDescent grammar
by ikegami (Patriarch) on Dec 12, 2006 at 16:58 UTC

    Start by not using <autotree>. You need to override it 90% of the time (100% in this case), so you might as well make everything explicit.

    use strict;
    use warnings;

    use Data::Dumper      qw( Dumper );
    use Parse::RecDescent qw( );

    $::RD_ERRORS = 1;
    $::RD_WARN   = 1;
    $::RD_HINT   = 1;
    #$::RD_TRACE  = 1;

    my $grammar = <<'_EOGRAMMAR_';

       {
          # These apply to code in this block and all actions.
          use strict;
          use warnings;

          my %escapes = (
             n => "\n",
          );

          sub dequote_double {
             for (my $s = @_ ? $_[0] : $_) {
                s/^"//g;
                s/"$//g;
                s/\\(.)/$escapes{$1} || $1/eg;
                return $_;
             }
          }
       }

       parse    : <skip:'(?:\s+|#[^\n]*\n)*'>
                  portDef(s?) /\Z/
                  { $item[2] }

       portDef  : "dbSetCellPortTypes"
                  QSTRING
                  QSTRING
                  portList
                  { [ @item[1..4] ] }

       portList : "'" "(" record(s?) ")"
                  { $item[3] }

       record   : "(" QSTRING(s?) ")"
                  { $item[2] }

       QSTRING  : /"(?:[^"\\]|\\.)*"/
                  { dequote_double($item[1]) }

    _EOGRAMMAR_

    my $parser = Parse::RecDescent->new($grammar)
       or die "Bad grammar\n";

    my $text = <<'_EOT_';
    dbSetCellPortTypes "/opt/mylib/s956M" "*" '(
       ("gnd!" "Inout" "Ground" )
       ("vint!" "Inout" "Power" )
    ) #f
    _EOT_

    my $net = $parser->parse($text)
       or die "bad netlist";

    print Dumper $net;

    Other fixes:

    • I shortened some rule names. They made it hard to line up the productions.
    • It's important to check for end of input (/\Z/).
    • Your grammar didn't use strict or warnings, so I added them.
    • Properly handled comments using <skip>.
    • Changed the format. The one I used makes it easier to locate rules, and makes alternation rules of the form rule: this | that easy to read. (Just line up the | with the :.)
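To illustrate the point about checking for end of input, here is a minimal sketch using a toy grammar that is not from this thread. The rule names `partial`, `whole` and `word` are made up for the example; the only difference between the two start rules is the trailing `/\Z/`.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical toy grammar: 'partial' stops wherever it can,
# 'whole' additionally demands end of input via /\Z/.
my $grammar = q{
    partial : word(s?)      { $item[1] }
    whole   : word(s?) /\Z/ { $item[1] }
    word    : /[a-z]+/
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar\n";

my $text = "alpha beta 123";   # the trailing "123" is not a word

my $partial = $parser->partial($text);
my $whole   = $parser->whole($text);

print "partial: ", ($partial ? "matched @$partial" : "failed"), "\n";
print "whole:   ", ($whole   ? "matched @$whole"   : "failed"), "\n";
```

Without `/\Z/` the parser should report success after consuming only the leading words, silently ignoring the rest; with it, leftover input makes the start rule fail, which is usually what you want for a file format.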

    Update: Fixed portList. (Replaced { $item[2] } with { $item[3] }.)
    Update: Fixed comment and missing (s?) in parse.

      Hello, Thanks for the reply.

      Does the /\Z/ match the #f at the end of the DATA?

      Here is the output I get:

      $VAR1 = [ 'dbSetCellPortTypes', '/opt/mylib/s956M', '*', '(' ];

      Looks like the portLists are not being returned correctly...could this have something to do with the #f? Thanks,

      Frank

        Here is the output I get:

        Oops, portList should be

        portList : "'" "(" record(s?) ")" { $item[3] }

        I made a minor change (undid my merging of ' and ( into a single token) right before posting, without testing.

        By the way, "'(" means no whitespace (as defined by <skip>, \s* by default) is allowed between the two characters, while "'" "(" means whitespace IS allowed.
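The difference between the two spellings can be approximated with plain regexes. This is only a rough stand-in for what the parser does, assuming the default skip pattern `\s*` is applied before each terminal; the variable names are invented for the sketch.

```perl
use strict;
use warnings;

# Rough stand-ins for how each form is matched (assumption: the
# default skip pattern \s* is applied before every terminal).
my $one_token  = qr/\A\s*'\(/;       # "'("    : quote and paren must touch
my $two_tokens = qr/\A\s*'\s*\(/;    # "'" "(" : skip is applied in between

for my $input ("'(", "' (", "'\n  (") {
    (my $shown = $input) =~ s/\n/\\n/g;   # make newlines visible when printing
    printf "%-8s one token: %-3s two tokens: %s\n",
        $shown,
        ($input =~ $one_token  ? "yes" : "no"),
        ($input =~ $two_tokens ? "yes" : "no");
}
```

Only the two-token form accepts input where something skippable sits between the quote and the parenthesis.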

        could this have something to do with the #f?

        Is that a comment? I treated it as a comment, so I used <skip> to handle it.

        If it's not a comment, what is it? An unquoted string? Does it support any kind of escapes?
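To see what the skip pattern does with the trailing `#f`, it can be tried on its own outside the parser. This is a sketch only: `strip_skip` is a helper invented here to mimic the parser removing skippable text before a terminal such as `/\Z/`.

```perl
use strict;
use warnings;

# The skip pattern from the grammar earlier in this thread.
my $skip = qr/(?:\s+|#[^\n]*\n)*/;

# Strip leading skippable text, as the parser does before each terminal.
sub strip_skip {
    my ($text) = @_;
    $text =~ s/\A$skip//;
    return $text;
}

my $rest       = strip_skip(" #f\n");   # comment ends with a newline
my $no_newline = strip_skip(" #f");     # same comment, but no final newline

print "with newline:    '", $rest, "'\n";
print "without newline: '", $no_newline, "'\n";
```

So `/\Z/` never matches the `#f` itself; the comment is skipped first and `/\Z/` then sees the end of the input. Note that the pattern requires a newline after the comment, so a file whose last line is a comment without a final newline would leave `#f` behind and the parse would fail.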

Re^3: Help with Parse::RecDescent grammar
by philcrow (Priest) on Dec 12, 2006 at 16:53 UTC
    Well, autotree is easy in the grammar, but harder later. The key is to use your own actions. You started that with the action for QUOTED_STRING, but you need more. I got this:
    $VAR1 = [
      {
        'trailer' => ' #f',
        'str2' => ' *',
        'portLists' => undef,
        'str1' => ' /opt/mylib/s956M'
      }
    ];
    from adding these actions:
    my $grammar = <<'_EOGRAMMAR_';
    QUOTED_STRING : /"/ <skip:""> quoted_char(s?) /"/
        { " " . join "", @{$item[3]} # leading space flags a string
        }

    quoted_char : /[^\\"]+/ { $item[1] }
                | /\\n/     { "\n" } { $item[1] }
                | /\\"/     { "\"" } { $item[1] }

    portDefinition: "dbSetCellPortTypes" QUOTED_STRING QUOTED_STRING portLists QUOTED_STRING
        { return { str1 => $item[2], str2 => $item[3], portLists => $item[4], trailer => $item[5] } }

    portLists: "'" "(" list(s?) ")" { $item[2] }

    list: "(" QUOTED_STRING(s) ")" { $item[2] }
    _EOGRAMMAR_
    I'd have better names for the string keys, if I knew what they were for.

    Phil
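Once the actions return plain arrays, a small post-processing pass can attach names and make the result easier to read. The sketch below assumes a parse result shaped like the array-returning grammar in this thread would produce (keyword, two strings, a list of three-element records). All the key names (`library`, `cell_pattern`, `name`, `direction`, `type`) are guesses, since the thread never says what the fields mean.

```perl
use strict;
use warnings;
use Data::Dumper qw( Dumper );

# Assumed shape of the parse result: one array per definition, holding
# the keyword, two strings, and a list of three-element port records.
my $net = [
    [ 'dbSetCellPortTypes', '/opt/mylib/s956M', '*',
      [ [ 'gnd!',  'Inout', 'Ground' ],
        [ 'vint!', 'Inout', 'Power'  ] ] ],
];

# Hypothetical field names; rename once the real meaning is known.
my @defs = map {
    my (undef, $library, $cell_pattern, $ports) = @$_;
    +{
        library      => $library,
        cell_pattern => $cell_pattern,
        ports        => [
            map { +{ name => $_->[0], direction => $_->[1], type => $_->[2] } } @$ports
        ],
    };
} @$net;

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump
print Dumper(\@defs);

# Named access instead of positional indexes:
print "$_->{name}: $_->{type}\n" for @{ $defs[0]{ports} };
```

The same naming could be done directly in the grammar actions by returning hash references instead of array references; doing it afterwards keeps the grammar short.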