in reply to Re: command input processing
in thread command input processing

That would be a slight improvement on what I have currently =), but most of my variability comes from having 10 different formats for, say, the log command, so I still end up with a horrid list. Maybe I can't escape that.

I'm expanding the interpreter to be a lot more featured so that it can process conditional things like...


if ( event >= other thing ) { do this; and maybe more things }

This makes the processing a bit more complicated, so I guess I'm probably more curious as to how languages, say Perl, interpret language structure/grammar, as I'm guessing they don't have some horrid long if chain.

I suppose if I'm only going to have, say, <25 permutations of commands, hardwiring the formats works, but making it reevaluate chunks in brackets as separate pieces starts to make life interesting, and it would be nice to make it handle the following...


if ( event1 > thing ) { data = var; goto func1 }
and if ( event1 > thing ) { goto func1; data = var; }

by just teaching it the formats...

if () {} data = var goto func

rather than limiting the system to the permutations I'd hard-coded. I guess this runs a bit deeper than the original question I posted; apologies for not being more specific the first time.


Regards Paul

Re^3: command input processing
by ikegami (Patriarch) on Oct 22, 2004 at 02:31 UTC
    I'm probably more curious as to how languages, say Perl, interpret language structure/grammar

    Ignore perlfunc functions for a moment. What you describe and perl don't compare. In your scenario, the script calls functions defined in the language itself. In perl, the script calls functions the script defines. Perl doesn't care what the function is called, or how many arguments it has, or anything, because all functions look the same to perl.

    Except, of course, builtin functions, which perlfunc functions may be. I don't know how that's done. C, C++ and Java don't have any builtin functions as far as I know. There's "new", but it's an operator. And yeah, there's surely a big code or data structure to parse the operators.

    What compilers do is make a tree. In compilers that create binaries or bytecode, the tree is serialized into instructions. perl keeps the program in tree form. For perl, executing the program is simply navigating the tree and taking actions based on the type of node it finds. Here's an example:

    >perl -MO=Terse -e "$a = $a + 1"
    LISTOP (0x18a01ec) leave [1]
        OP (0x18a01d0) enter
        COP (0x18a0210) nextstate
        BINOP (0x18a024c) sassign
            BINOP (0x18a0270) add [3]
                UNOP (0x18a02b4) null [15]
                    PADOP (0x18a02d4) gvsv  2
                SVOP (0x18a0294) const  SPECIAL #0 Nullsv
            UNOP (0x18a02f4) null [15]
                PADOP (0x18a0314) gvsv  1
    -e syntax OK
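    To make the "navigate the tree and act on each node type" idea concrete, here is a minimal sketch of a tree-walking evaluator. It is not how perl itself is implemented; the node names (assign, add, var, const), the %handlers table and the run function are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical tree for "$a = $a + 1". The node names are invented for
# this sketch and are not perl's real op names.
my $tree = [ 'assign', 'a', [ 'add', [ 'var', 'a' ], [ 'const', 1 ] ] ];

my %vars = ( a => 41 );

# One handler per node type; each receives the node's children.
my %handlers = (
    const  => sub { my ($value) = @_; return $value },
    var    => sub { my ($name)  = @_; return $vars{$name} },
    add    => sub { my ($l, $r) = @_; return run($l) + run($r) },
    assign => sub { my ($name, $expr) = @_; return $vars{$name} = run($expr) },
);

# Executing is just: look at the node type, dispatch, recurse.
sub run {
    my ($node) = @_;
    my ($type, @children) = @$node;
    my $handler = $handlers{$type}
        or die "Unknown node type: $type\n";
    return $handler->(@children);
}

run($tree);
print "a = $vars{a}\n";    # prints "a = 42"
```

    Adding a new construct then means adding one node type and one handler, rather than another regex for every permutation.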

    It takes lots of code and/or lots of tables to build this tree. 387772 is a parser I wrote for a relatively simple grammar. Look how long it is, and the language it parses only has conditionals, loops, line continuations, literals and variables, nothing else. There are tools to assist you, such as lex+yacc and Parse::RecDescent, but it's still a lot of work. (Coding is only a small portion of it, too. The amount of thought that should be put into the design of a language is enormous, but that's off topic.)

      OK, so I've been toying around with some code to parse multiple combinations of commands, and here is my first crack at it. I would be interested in anyone's thoughts.

      #!/usr/bin/perl
      use warnings;

      my %action_list = ();

      my %rules = (
          Cmd_SetVariable => qw' ^\s*(\w+)+\s*\=\s*(\w+)\s*$ ',
          Cmd_IfThen      => qw' ^\s*if\s*\(\s*(\w+)\s*(>=?|<=?|==)\s*(\w+)\s*\)\s*\{\s*([\w\s\(\)=]+)\;?\s*\}$ ',
          Cmd_Chunk       => qw' ^\s*\w+\s*$ ',
          Cmd_MultiChunk  => qw' ^\s*(.*?\;){0,}\s*$ ',
      );

      sub AUTOLOAD {
          my $func = shift;
          print "Oops, handler not defined : $func\n";
      }

      sub Cmd_SetVariable {
          print "Its a Set Variable Chunk\n";
      }

      sub Cmd_IfThen {
          my ($this, $chunk) = @_;
          print "Its a If then Chunk\n";
          my ($arg1, $op, $arg2, $do) = $chunk =~ /$rules{Cmd_IfThen}/;
          print "arg1: $arg1 op: $op arg2: $arg2 do: $do\n";
          CheckChunk($do);
      }

      sub Cmd_MultiChunk {
          my ($this, $chunk) = @_;
          print "Its a Multi Chunk\n";
          my @chunks = split ';',$chunk;
          foreach $c ( @chunks ) {
              CheckChunk($c);
          }
      }

      sub ViewList {
          print "\nActionList\n";
          foreach my $key ( sort keys %action_list ) {
              my @commands = split ';', $action_list{$key};
              print "$key:";
              foreach my $v ( @commands ) {
                  print "\t$v\n";
              }
          }
          print "\n\n";
      }

      sub AddAction {
          my $arg = shift;
          my ($action_name, $action) = split ':', $arg;
          $action_list{$action_name} = $action;
      }

      sub CheckChunk {
          my $command = shift;
          my $translated;
          print "\nProcessing : $command\n";
          $translated = 0;
          foreach my $rulename ( keys %rules ) {
              if ( $command =~ /$rules{$rulename}/ ) {
                  print "Gotcha :$rules{$rulename}\n";
                  &{$rulename}($rulename, $command);
                  $translated = 1;
              }
          }
          if ( $translated != 1 ) {
              print "Syntax Error : $command\n";
          }
      }

      sub CheckActions {
          foreach my $name ( sort keys %action_list ) {
              CheckChunk($action_list{$name});
          }
      }

      AddAction("test: a=4;b =5 ; c = 9;if (b > a) { c = 100 };");
      #AddAction("Camel: Camel = 5");
      #AddAction("hippo: ilovetoswim");
      #AddAction("ifthen1: if ( ABC > 500 ){XYZ = 10}");
      #AddAction("ifthen2: if ( ABC < 500 ){sdfsdf}");
      #AddAction("ifthen3: if ( ABC = 500 ){sdfsdf}");
      #AddAction("ifthen4: if ( ABC == 500 ){sdfsdf}");
      #AddAction("ifthen5: if ( ABC << 500 ){sdfsdf}"); #supposed to break
      #AddAction("ifthen6: if ( ABC >> 500 ){sdfsdf}"); #supposed to break
      #AddAction("ifthen7: if ( ABC <= 500 ){sdfsdf}");
      #AddAction("ifthen8: if ( ABC >= 500 ){sdfsdf}");

      ViewList();
      CheckActions();

      OUTPUT:

      ActionList
      hippo:  a=4
              b =5
              c = 9
              if (b > a) { c = 100 }


      Processing : a=4;b =5 ; c = 9;if (b > a) { c = 100 };
      Gotcha :^\s*(.*?\;){0,}\s*$
      Its a Multi Chunk

      Processing : a=4
      Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
      Its a Set Variable Chunk

      Processing : b =5
      Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
      Its a Set Variable Chunk

      Processing : c = 9
      Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
      Its a Set Variable Chunk

      Processing : if (b > a) { c = 100 }
      Gotcha :^\s*if\s*\(\s*(\w+)\s*(>=?|<=?|==)\s*(\w+)\s*\)\s*\{\s*([\w\s\(\)=]+)\;?\s*\}$
      Its a If then Chunk
      arg1: b op: > arg2: a do: c = 100

      Processing : c = 100
      Gotcha :^\s*(\w+)+\s*\=\s*(\w+)\s*$
      Its a Set Variable Chunk

      Regards Paul

        The language you're trying to parse now is much more complex than the one in your initial post. You'll soon end up with problems if you keep using your current method of trying to match the entire statement at once. Consider the following.

        1) Nested statements. How will you handle if (...) { if (...) { ... } }?

        2) Order of operation. How will you handle a = b + c * d?

        3) Error reporting. How can you give a meaningful error message when, at best, you only know whether a whole statement is valid or not?
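        As a rough illustration of points 2 and 3, here is a hand-written sketch that tokenizes first and then parses token by token, instead of matching the whole statement with one regex. It uses precedence climbing for a = b + c * d and reports the offset where parsing went wrong. The function names (tokenize, parse_expr, parse_primary, parse_assign, show) and the tiny operator table are assumptions for this example only, not part of the code posted above.

```perl
use strict;
use warnings;

# Binding strength of each binary operator (higher binds tighter).
my %prec = ( '+' => 1, '-' => 1, '*' => 2, '/' => 2 );

# Split the input into [type, text, offset] tokens, dying with the
# offset of the first character that cannot start a token.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    for ($src) {
        while (1) {
            /\G\s+/gc;
            last if /\G\z/gc;
            my $pos = pos() // 0;
            if    (/\G([A-Za-z_]\w*)/gc) { push @tokens, [ 'id',  $1, $pos ] }
            elsif (/\G(\d+)/gc)          { push @tokens, [ 'num', $1, $pos ] }
            elsif (/\G([-+*\/()=])/gc)   { push @tokens, [ 'op',  $1, $pos ] }
            else { die "Unexpected character at offset $pos\n" }
        }
    }
    return \@tokens;
}

# Precedence climbing: parse a primary, then keep consuming operators
# that bind at least as tightly as $min_prec.
sub parse_expr {
    my ($tokens, $min_prec) = @_;
    my $left = parse_primary($tokens);
    while (@$tokens
        && $tokens->[0][0] eq 'op'
        && exists $prec{ $tokens->[0][1] }
        && $prec{ $tokens->[0][1] } >= $min_prec)
    {
        my $op    = shift(@$tokens)->[1];
        my $right = parse_expr($tokens, $prec{$op} + 1);
        $left = [ $op, $left, $right ];
    }
    return $left;
}

# A primary is an identifier, a number, or a parenthesized expression.
sub parse_primary {
    my ($tokens) = @_;
    my $tok = shift @$tokens
        or die "Unexpected end of input\n";
    my ($type, $text, $pos) = @$tok;
    return [ $type, $text ] if $type eq 'id' || $type eq 'num';
    if ($text eq '(') {
        my $inner = parse_expr($tokens, 1);
        my $close = shift @$tokens;
        die "Expected ')' to match '(' at offset $pos\n"
            unless $close && $close->[1] eq ')';
        return $inner;
    }
    die "Unexpected '$text' at offset $pos\n";
}

# statement: IDENTIFIER '=' expr
sub parse_assign {
    my ($src)  = @_;
    my $tokens = tokenize($src);
    my $target = shift @$tokens;
    die "Expected a variable name at the start\n"
        unless $target && $target->[0] eq 'id';
    my $eq = shift @$tokens;
    die "Expected '=' after '$target->[1]'\n"
        unless $eq && $eq->[1] eq '=';
    my $expr = parse_expr($tokens, 1);
    die "Unexpected '$tokens->[0][1]' at offset $tokens->[0][2]\n" if @$tokens;
    return [ '=', [ 'id', $target->[1] ], $expr ];
}

# Render the tree with explicit parentheses to show the grouping.
sub show {
    my ($node) = @_;
    my ($head, @rest) = @$node;
    return $rest[0] if $head eq 'id' || $head eq 'num';
    return '(' . show($rest[0]) . " $head " . show($rest[1]) . ')';
}

print show(parse_assign('a = b + c * d')), "\n";    # prints "(a = (b + (c * d)))"
```

        Because parse_primary calls back into parse_expr for parentheses, nesting comes for free; the same recursion is what would handle point 1 once statements and blocks are parsed this way as well.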

        Usually, it's not worth the effort to invent a new language. Embedding Perl sounds like a great idea if you're looking for a full-featured language. If you need to restrict access to certain functions, the core module Safe allows you to do just that.
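        A minimal sketch of what that could look like with the Safe module, assuming the macros are written as plain Perl. The macro text and variable names are hypothetical; only Safe->new and reval are used, with the compartment's default restrictions left as they are.

```perl
use strict;
use warnings;
use Safe;

# A fresh compartment starts with Safe's default operator restrictions,
# which refuse things like system().
my $compartment = Safe->new;

# Hypothetical user-supplied macro, written in plain Perl. reval does
# not turn on strict unless asked, so bare package variables work.
my $macro = '$a = 4; $b = 5; $c = 9; if ($b > $a) { $c = 100 } $c';

# reval compiles and runs the text inside the compartment and returns
# the value of its last statement.
my $result = $compartment->reval($macro);
die "Macro failed: $@" if $@;
print "c = $result\n";

# An operation outside the permitted set is refused and reported in $@.
my $blocked = $compartment->reval('system("echo hello")');
print "Refused: $@" if $@;
```

        Perl then takes care of nesting, precedence and error messages, and the compartment decides which operations a macro may use.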

        For fun, what follows is a RecDescent parser for the grammar you had above. (It actually does a little bit more, but I wanted to keep it very close to yours. I could have made it identical, so don't think it's a limitation of the module. I did force assignments to end with semi-colons to simplify the grammar and keep it readable.)

        use strict;
        use warnings;

        use Data::Dumper      ();
        use Parse::RecDescent ();

        {
           my $grammar = <<'__EOI__';

              {
                 use strict;
                 use warnings;
              }

              # --- Tokens ---

              EOF        : /^\Z/
              IDENTIFIER : /[A-Za-z]\w*/
              LITERAL    : /\d+/
              REL_OP     : />=?|<=?|==/
              EQUAL      : '='

              # --- Keywords ---

              IF_KEYWORD : IDENTIFIER { $item[1] eq 'if' ? $item[1] : undef }

              # --- Rules ---

              parse   : stmt(s?) EOF { $item[1] }

              stmt    : IF_KEYWORD <commit> if_rest { [ $item[1], @{$item[3]} ] }
                      | label { $item[1] }
                      | assign ';' { $item[1] }
                      | call ';' { $item[1] }
                      | <error>

              if_rest : '(' compare ')' '{' stmt(s?) '}' { [ @item[2, 5] ] }

              label   : IDENTIFIER ':' { [ @item[0, 1] ] }

              assign  : IDENTIFIER EQUAL term { [ @item[2, 1, 3] ] }

              call    : IDENTIFIER { [ @item[0, 1] ] }

              compare : term REL_OP term { [ @item[2, 1, 3] ] }

              term    : IDENTIFIER { [ 'identifier', $item[1] ] }
                      | LITERAL { [ 'literal', $item[1] ] }

        __EOI__

           $::RD_HINT = 1;
           # $::RD_TRACE = 1;

           # Parse::RecDescent::Hack->Precompile($grammar, "Module");
           my $parser = Parse::RecDescent->new($grammar);
           die("Bad grammar.\n") unless defined($parser);

           my $text = <<'__EOI__';
        test: a=4;b =5 ; c = 9; if (b > a) { c = 100; }
        Camel: Camel = 5;
        hippo: ilovetoswim;
        ifthen1: if ( ABC > 500 ){XYZ = 10;}
        ifthen2: if ( ABC < 500 ){sdfsdf;}
        ifthen4: if ( ABC == 500 ){sdfsdf;}
        ifthen7: if ( ABC <= 500 ){sdfsdf;}
        ifthen8: if ( ABC >= 500 ){sdfsdf;}
        __EOI__

           my $result = $parser->parse(\$text);
           die("Bad text.\n") unless (defined($result));

           print("Parse Tree\n");
           print("==========\n");
           print Data::Dumper::Dumper($result);
        }

        From here, it's easy to replace compare and assign with expr. And it's pretty simple to expand expr into something that handles numerous operators with varied precedence. Adding while would be trivial.

Re^3: command input processing
by Fletch (Bishop) on Oct 22, 2004 at 00:23 UTC

    And then you keep adding things and adding things and you've created your own language (cf. PHP :). If you want a more full-featured language you've got two options:

    • write your own interpreter (using something like Parse::RecDescent or byacc)
    • let your users just write perl and have a wrapper hide details like use MyLanguageModule qw( blah blah )

    If you've got users that can handle it, I'd go for the latter, as it's less work for you and they don't have to learn yet another language.
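    A rough sketch of the second option, assuming the macro author is trusted (a string eval gives the macro full access to Perl). MyCommands stands in for the MyLanguageModule idea; it, log_event, set_level and run_macro are all hypothetical names for illustration.

```perl
use strict;
use warnings;

# Hypothetical command set standing in for "MyLanguageModule"; a real
# one would live in its own .pm file and be pulled in with "use".
package MyCommands;

our @journal;    # records what the macro did, for demonstration

sub log_event { my ($message) = @_; push @journal, "log: $message"; return }
sub set_level { my ($level)   = @_; push @journal, "level: $level"; return }

package main;

# Wrap the user's text so it compiles inside MyCommands with strict on;
# the user never writes the package or use lines themselves.
sub run_macro {
    my ($source) = @_;
    my $wrapped = "package MyCommands; use strict; use warnings;\n$source";
    my $ok = eval "$wrapped\n; 1";
    die "Macro error: $@" unless $ok;
    return;
}

run_macro(q{
    my $event = 7;
    if ($event >= 5) { log_event("threshold reached"); set_level(2) }
});

print "$_\n" for @MyCommands::journal;
```

    The macro is ordinary Perl, so conditionals, nesting and statement order need no extra work in the wrapper.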

      Embedding perl isn't a bad idea actually =)... Thanks, then perl can take care of the syntax etc. It will just be myself that is writing the commands/macros so that should not be a problem. Regards Paul