I found a couple of issues with the grammar besides the return undef problem. In a couple of cases you wrote the grammar as though it would be handled the same way that yacc would. For instance,
This is bad because because it will cause 'foo <= 10' to be parsed as 'foo < = 10'. Replacing it by a single regex to match the appropriate options is better here (and faster).relop : '<' | '<='
Another one was the way you handled the label. If the opcode part following a label fails the parse, the label is still assigned. A similar (and very common mistake) occurs in your startrule. Your grammar can match a sentence that starts off valid and then degenerates. Perhaps this is the behaviour you wanted, but im guessing you werent aware. (Incidentally I dont recall that its well documented but the startrule doesnt have to be called that, you can call $parser->any_defined_rule($foo) and have it start parsing from that rule. Which can be useful for debugging a particular rule, but also makes using the parser look a little nicer.)
If you uncomment the various rules and comment out their complements then youll see the failures occur (you really should trap to see if the parse failed :-). I also cleaned things up a bit. Now a single rule 'identifier' fills in for all the places where label_name and var_name were being used.
Anyay, interesting stuff, I kinda wonder about some things in it, I dont understand why the variables in if are evaluated at compile time and not run time. Also why there isnt more checking of values in the label and symbol table. Im kinda interested now to review your latest post. :-)
#!/usr/bin/perl use strict; use warnings; use Data::Dumper; use Parse::RecDescent; $::RD_ERRORS=1 || undef; $::RD_WARN =1 || undef; $::RD_HINT =0 || undef; $::RD_TRACE =0 || undef; use vars qw( %symbol %axis @code $curr %label ); @axis{'X','Y','Z'} = (0,0,1); @code = (); sub fetchlabel { my $label_name=shift; die "Undefined \$label_name?\n" unless defined $label_name; my $val=$label{$label_name}; unless (defined $val) { warn "Label '$label_name' unknown \n"; # do something with undef $labelval $val="*UNKNOWN ($label_name*"; } return $val; } sub fetchsymbol { my $symbol_name=shift; die "Undefined \$symbol_name?\n" unless defined $symbol_name; my $val=$symbol{$symbol_name}; unless (defined $val) { warn "Symbol '$symbol_name' not defined, treating as 0\n"; $val=0 } return $val; } my $parser = Parse::RecDescent->new(<<'EOF_GRAMMAR'); parse : statement(s) /\s*\z/ { @{$item[1]} } # num stmts parsed statement : opcode | identifier opcode { $::label{$item[1]} = $#::code } #statement : opcode | label opcode #bad #label : identifier { $::label{$item[1]} = scalar @::code } opcode: IF | YMOVE | XMOVE | DOWN | UP | PRINTLOC | GOTO | SET | ADD | SUB | HALT YMOVE: /YMOVE/i int_or_var { push @::code, '$axis{Y} += ' . $item{int_or_var}; } XMOVE: /XMOVE/i int_or_var { push @::code, '$axis{X} += ' . $item{int_or_var}; } DOWN: /DOWN/i { push @::code, '$axis{Z}--'; } UP: /UP/i { push @::code, '$axis{Z}++'; } PRINTLOC: /PRINTLOC/i { push @::code, 'print "@{[map $axis{$_},sort keys %axis]}\n"'; } IF: /IF/i int_or_var relop int_or_var identifier { push @::code, "GOTO:".::fetchlabel($item{identifier}). ":" . join(' ',@item[2,3,4]); } GOTO: /GOTO/i identifier { push @::code, "GOTO:".::fetchlabel($item{identifier}).":1"; } SET: /SET/i identifier int_or_var { push @::code, '$symbol{' . $item{identifier} . '} = ' . $item{ +int_or_var}; } ADD: /ADD/i identifier int_or_var { push @::code, '$symbol{' . $item{identifier} . '} += ' . $item +{int_or_var}; } SUB: /SUB/i identifier int_or_var { push @::code, '$symbol{' . $item{identifier} . '} -= ' . $item +{int_or_var}; } HALT: /HALT/i { push @::code, 'exit(0)'; } int_or_var : /-*\d+/ { $item[1] } | identifier { ::fetchsymbol($item[1]) } relop : /[<>=!]=|[<>]/{ $item[1] } # relop : '<' | '<=' # bad identifier : /[a-zA-Z]\w*/ { $item[1] } EOF_GRAMMAR print "Label{label}=".( defined $label{label} ? $label{label} : "undef"),"\n"; while (<DATA>) { /^\s+$/ and last; $parser->parse($_) or warn "$_ failed parse\n"; } print "Label{label}=".( defined $label{label} ? $label{label} : "undef"),"\n"; for my $cur (0..$#code) { my $line = $code[$cur]; if (substr($line,0,4) eq 'GOTO') { my ($new,$true) = (split(':',$line,3))[1,2]; $cur = $new - 1 and next if eval $true; } #eval $code[$cur]; print "$code[$cur];\n"; } __DATA__ L2 set foo 10 label if foo 10 L2 set foo 10 ymove 1 L2 xmove 1 printloc IF FOO > 10 L2 halt
Update: Ok I read the latest code, and I like the improvements. You resolved at least a couple of the the issues I mention above. But some are still open and waiting to bite you.
Goodnight and HTH
--- demerphq
my friends call me, usually because I'm late....
In reply to Re: (jeffa) Re: Help with tweaking Parse::RecDescent grammar
by demerphq
in thread Help with tweaking Parse::RecDescent grammar
by jeffa
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |