in reply to Re^3: command input processing
in thread command input processing

OK, so I've been toying around with some code to parse multiple combinations of commands, and here is my first crack at it. I would be interested in anyone's thoughts.

#!/usr/bin/perl
use warnings;

my %action_list = ();

my %rules = (
    Cmd_SetVariable => qw' ^\s*(\w+)+\s*\=\s*(\w+)\s*$ ',
    Cmd_IfThen      => qw' ^\s*if\s*\(\s*(\w+)\s*(>=?|<=?|==)\s*(\w+)\s*\)\s*\{\s*([\w\s\(\)=]+)\;?\s*\}$ ',
    Cmd_Chunk       => qw' ^\s*\w+\s*$ ',
    Cmd_MultiChunk  => qw' ^\s*(.*?\;){0,}\s*$ ',
);

sub AUTOLOAD {
    my $func = shift;
    print "Oops, handler not defined : $func\n";
}

sub Cmd_SetVariable {
    print "Its a Set Variable Chunk\n";
}

sub Cmd_IfThen {
    my ($this, $chunk) = @_;
    print "Its a If then Chunk\n";
    my ($arg1, $op, $arg2, $do) = $chunk =~ /$rules{Cmd_IfThen}/;
    print "arg1: $arg1 op: $op arg2: $arg2 do: $do\n";
    CheckChunk($do);
}

sub Cmd_MultiChunk {
    my ($this, $chunk) = @_;
    print "Its a Multi Chunk\n";
    my @chunks = split ';',$chunk;
    foreach $c ( @chunks ) {
        CheckChunk($c);
    }
}

sub ViewList {
    print "\nActionList\n";
    foreach my $key ( sort keys %action_list ) {
        my @commands = split ';', $action_list{$key};
        print "$key:";
        foreach my $v ( @commands ) {
            print "\t$v\n";
        }
    }
    print "\n\n";
}

sub AddAction {
    my $arg = shift;
    my ($action_name, $action) = split ':', $arg;
    $action_list{$action_name} = $action;
}

sub CheckChunk {
    my $command = shift;
    my $translated;
    print "\nProcessing : $command\n";
    $translated = 0;
    foreach my $rulename ( keys %rules ) {
        if ( $command =~ /$rules{$rulename}/ ) {
            print "Gotcha :$rules{$rulename}\n";
            &{$rulename}($rulename, $command);
            $translated = 1;
        }
    }
    if ( $translated != 1 ) {
        print "Syntax Error : $command\n";
    }
}

sub CheckActions {
    foreach my $name ( sort keys %action_list ) {
        CheckChunk($action_list{$name});
    }
}

AddAction("test: a=4;b =5 ; c = 9;if (b > a) { c = 100 };");
#AddAction("Camel: Camel = 5");
#AddAction("hippo: ilovetoswim");
#AddAction("ifthen1: if ( ABC > 500 ){XYZ = 10}");
#AddAction("ifthen2: if ( ABC < 500 ){sdfsdf}");
#AddAction("ifthen3: if ( ABC = 500 ){sdfsdf}");
#AddAction("ifthen4: if ( ABC == 500 ){sdfsdf}");
#AddAction("ifthen5: if ( ABC << 500 ){sdfsdf}"); #supposed to break
#AddAction("ifthen6: if ( ABC >> 500 ){sdfsdf}"); #supposed to break
#AddAction("ifthen7: if ( ABC <= 500 ){sdfsdf}");
#AddAction("ifthen8: if ( ABC >= 500 ){sdfsdf}");

ViewList();
CheckActions();

OUTPUT:

ActionList
hippo:  a=4
        b =5
        c = 9
        if (b > a) { c = 100 }


Processing : a=4;b =5 ; c = 9;if (b > a) { c = 100 };
Gotcha :^\s*(.*?\;){0,}\s*$
Its a Multi Chunk

Processing : a=4
Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
Its a Set Variable Chunk

Processing : b =5
Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
Its a Set Variable Chunk

Processing : c = 9
Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
Its a Set Variable Chunk

Processing : if (b > a) { c = 100 }
Gotcha :^\s*if\s*\(\s*(\w+)\s*(>=?|<=?|==)\s*(\w+)\s*\)\s*\{\s*([\w\s\(\)=]+)\;?\s*\}$
Its a If then Chunk
arg1: b op: > arg2: a do: c = 100

Processing : c = 100
Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
Its a Set Variable Chunk

Regards Paul

Re^5: command input processing
by ikegami (Patriarch) on Oct 22, 2004 at 17:46 UTC

    The language you're trying to parse now is much more complex than the one in your initial post. You'll soon end up with problems if you keep using your current method of trying to match the entire statement at once. Consider the following.

    1) Nested statements. How will you handle if (...) { if (...) { ... } }?

    2) Order of operation. How will you handle a = b + c * d?

    3) Error reporting. How can you give a meaningful error message when, at best, you only know whether a whole statement is valid or not?
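To make the first point concrete, here is a small sketch (not from the original post) that reuses the Cmd_IfThen pattern verbatim and tries it on a flat and a nested statement. The variable names $if_then, $flat and $nested are made up for the illustration. It also shows what splitting on ';' does to a block that contains more than one statement.

```perl
use strict;
use warnings;

# The if/then pattern from the post above, copied verbatim.
my $if_then = qr/^\s*if\s*\(\s*(\w+)\s*(>=?|<=?|==)\s*(\w+)\s*\)\s*\{\s*([\w\s\(\)=]+)\;?\s*\}$/;

# Hypothetical inputs: one flat statement, one nested statement.
my $flat   = 'if (b > a) { c = 100 }';
my $nested = 'if (b > a) { if (c > a) { c = 100 } }';

print "flat:   ", ( $flat   =~ $if_then ? "matches" : "no match" ), "\n";
print "nested: ", ( $nested =~ $if_then ? "matches" : "no match" ), "\n";

# Splitting on ';' before matching also cuts a multi-statement body in two,
# so neither fragment is a complete statement any more.
my @pieces = split /;/, 'if (b > a) { c = 1; d = 2 }';
print "piece: [$_]\n" for @pieces;
```

Run as written, the flat statement matches and the nested one does not, because the body's character class cannot contain '>', '{' or '}'. The split yields two fragments, "if (b > a) { c = 1" and " d = 2 }", and each would be reported as a syntax error.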

    Usually, it's not worth the effort to invent a new language. Embedding Perl sounds like a great idea if you're looking for a full-featured language. If you need to restrict access to certain functions, the core module Safe allows you to do just that.
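As a rough illustration of the embedding idea, the sketch below runs a user-supplied snippet inside a Safe compartment. It assumes the default opcode mask is restrictive enough for your purposes, which you would want to check against the Safe documentation. The snippet strings are invented for the example.

```perl
use strict;
use warnings;
use Safe;

# A compartment with Safe's default (restricted) set of permitted opcodes.
my $cpt = Safe->new;

# Ordinary expressions and conditionals are evaluated as normal Perl.
my $value = $cpt->reval('my ($a, $b) = (4, 5); $b > $a ? 100 : 9');
print "result: $value\n";

# Operations outside the permitted set are trapped at compile time;
# reval returns undef and the reason is left in $@.
my $blocked = $cpt->reval('system("echo hello")');
my $error   = $@;
print "blocked: $error" if $error;
```

The trade-off is that users write real Perl rather than a custom syntax, so the "parser" costs nothing but the language is no longer yours to shape.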

    For fun, what follows is a RecDescent parser for the grammar you had above. (It actually does a little bit more, but I wanted to keep it very close to yours. I could have made it identical, so don't think it's a limitation of the module. I did force assignments to end with semi-colons to simplify the grammar and keep it readable.)

    use strict;
    use warnings;

    use Data::Dumper      ();
    use Parse::RecDescent ();

    {
       my $grammar = <<'__EOI__';

          {
             use strict;
             use warnings;
          }

          # --- Tokens ---

          EOF        : /^\Z/
          IDENTIFIER : /[A-Za-z]\w*/
          LITERAL    : /\d+/
          REL_OP     : />=?|<=?|==/
          EQUAL      : '='

          # --- Keywords ---

          IF_KEYWORD : IDENTIFIER { $item[1] eq 'if' ? $item[1] : undef }

          # --- Rules ---

          parse   : stmt(s?) EOF { $item[1] }

          stmt    : IF_KEYWORD <commit> if_rest { [ $item[1], @{$item[3]} ] }
                  | label      { $item[1] }
                  | assign ';' { $item[1] }
                  | call ';'   { $item[1] }
                  | <error>

          if_rest : '(' compare ')' '{' stmt(s?) '}' { [ @item[2, 5] ] }

          label   : IDENTIFIER ':' { [ @item[0, 1] ] }

          assign  : IDENTIFIER EQUAL term { [ @item[2, 1, 3] ] }

          call    : IDENTIFIER { [ @item[0, 1] ] }

          compare : term REL_OP term { [ @item[2, 1, 3] ] }

          term    : IDENTIFIER { [ 'identifier', $item[1] ] }
                  | LITERAL    { [ 'literal', $item[1] ] }

    __EOI__

       $::RD_HINT = 1;
       # $::RD_TRACE = 1;

       # Parse::RecDescent::Hack->Precompile($grammar, "Module");
       my $parser = Parse::RecDescent->new($grammar);
       die("Bad grammar.\n") unless defined($parser);

       my $text = <<'__EOI__';
    test: a=4;b =5 ; c = 9; if (b > a) { c = 100; }
    Camel: Camel = 5;
    hippo: ilovetoswim;
    ifthen1: if ( ABC > 500 ){XYZ = 10;}
    ifthen2: if ( ABC < 500 ){sdfsdf;}
    ifthen4: if ( ABC == 500 ){sdfsdf;}
    ifthen7: if ( ABC <= 500 ){sdfsdf;}
    ifthen8: if ( ABC >= 500 ){sdfsdf;}
    __EOI__

       my $result = $parser->parse(\$text);
       die("Bad text.\n") unless (defined($result));

       print("Parse Tree\n");
       print("==========\n");
       print Data::Dumper::Dumper($result);
    }

    From here, it's easy to replace compare and assign with expr. And it's pretty simple to expand expr into something that handles numerous operators with varied precedence. Adding while would be trivial.
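To show what "varied precedence" involves, here is a minimal hand-written recursive-descent sketch in plain Perl. It is not a Parse::RecDescent grammar and not ikegami's code, just the same layering idea written out by hand: one level for + and -, a tighter level for * and /. The sub names (tokenize, parse_expr, parse_term, parse_factor, to_string) are invented for the example, and the tree nodes borrow the [ op, left, right ] shape used by compare in the grammar above.

```perl
use strict;
use warnings;

# Split the source into numbers, identifiers, operators and parentheses.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    for ($src) {
        while (1) {
            /\G\s+/gc;                       # skip whitespace
            last if /\G\z/gc;                # end of input
            if (/\G(\d+|[A-Za-z]\w*|[-+*\/()])/gc) {
                push @tokens, $1;
                next;
            }
            die "Unexpected character at offset " . ( pos() // 0 ) . "\n";
        }
    }
    return @tokens;
}

# expr : term ( ('+' | '-') term )*      lowest precedence
sub parse_expr {
    my ($tokens) = @_;
    my $left = parse_term($tokens);
    while ( @$tokens && $tokens->[0] =~ /^[-+]$/ ) {
        my $op = shift @$tokens;
        $left = [ $op, $left, parse_term($tokens) ];
    }
    return $left;
}

# term : factor ( ('*' | '/') factor )*  binds tighter than + and -
sub parse_term {
    my ($tokens) = @_;
    my $left = parse_factor($tokens);
    while ( @$tokens && $tokens->[0] =~ /^[*\/]$/ ) {
        my $op = shift @$tokens;
        $left = [ $op, $left, parse_factor($tokens) ];
    }
    return $left;
}

# factor : LITERAL | IDENTIFIER | '(' expr ')'
sub parse_factor {
    my ($tokens) = @_;
    my $tok = shift @$tokens;
    die "Unexpected end of input\n" unless defined $tok;
    return [ 'literal',    $tok ] if $tok =~ /^\d+$/;
    return [ 'identifier', $tok ] if $tok =~ /^[A-Za-z]\w*$/;
    if ( $tok eq '(' ) {
        my $inner = parse_expr($tokens);
        my $close = shift @$tokens;
        die "Expected ')'\n" unless defined $close && $close eq ')';
        return $inner;
    }
    die "Unexpected token '$tok'\n";
}

# Render the tree fully parenthesised so the grouping is visible.
sub to_string {
    my ($node) = @_;
    my ( $kind, @rest ) = @$node;
    return $rest[0] if $kind eq 'literal' || $kind eq 'identifier';
    return '(' . to_string( $rest[0] ) . " $kind " . to_string( $rest[1] ) . ')';
}

my @tokens = tokenize('b + c * d');
my $tree   = parse_expr( \@tokens );
die "Trailing tokens: @tokens\n" if @tokens;
print to_string($tree), "\n";    # prints "(b + (c * d))"
```

Each precedence level gets its own sub, and a level only ever calls the next tighter one, which is why c * d is grouped before the addition. In a RecDescent grammar the same layering would be expressed as one rule per level.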

      Ikegami, thanks for the response; it's just what I was after =). I realize the shortcomings of the version I wrote, but I had to give it a go for the mental exercise. I guess you'd have to add binary trees to your storage mechanism to do anything useful, as you say, rather than just being able to know whether there is a fault in an entire line (not that my C compiler is any more useful at times). That can wait, though; I've got the doco of a whole new module to wade through. Thanks again, I appreciate your insight. Regards, Paul