in reply to Re: Parsing and converting Fortran expression
in thread Parsing and converting Fortran expression [solved]

I'm a big fan of Parse::Yapp myself, but I thought I'd try converting a simple arithmetic parser I had so that it builds a parse tree, and then see what it would take to generate the C code. It was just a fun little project.

After submitting it, I realized the parse tree, though cool, was not needed for such a simple problem, so the second one just generates C directly.

A possible choice is to use Parse::Yapp to generate the parse tree, and then my XXX::C subs to produce the code.

To kikuchiyo: I'd also suggest reading the "Which is 'best'?" paragraph above several times; it's a very good analysis of the choices and the problems they present.

Btw, *all* parsers are sneaky and complicated and have many tricks and traps for the unwary. :(

Re^3: Parsing and converting Fortran expression
by kikuchiyo (Hermit) on Aug 26, 2015 at 18:52 UTC

    I've tried to use Parse::RecDescent in the past, but IIRC the initial cost of trying to understand that module was quite high.

    In the meantime I've recalled that I solved a similar problem with Parse::Lex earlier at $work, but by that time the solution earlier in the thread had appeared and it was too good not to use.

    I agree that parsing is tricky; no wonder there is a library's worth of theoretical work related to it, and several generations' worth of tools invented and re-invented by people who don't know said theoretical work (me included).

      As an example of parsing being tricky, note that your modification fails to handle precedence differences ( || less than && less than == less than < ). This may or may not be important for your particular task.

      Since it may be, and I don't care for Parse::RecDescent when many levels of precedence are involved, here's a Parse::Yapp version (with all the precedence stuff neatly packaged at the top).

      # fortran2c.yp - yapp version of converting FORTRAN to C

      %left '.and.' '.or.'
      %right '.not.'
      %nonassoc '.eq.' '.ne.' '.eqv.' '.neqv.'
      %nonassoc '.ge.' '.le.' '.gt.' '.lt.'
      %left '+' '-'
      %left '*' '/'

      %%

      exp : exp '+' exp           { "(($_[1]) + ($_[3]))" }
        | exp '-' exp             { "(($_[1]) - ($_[3]))" }
        | exp '*' exp             { "(($_[1]) * ($_[3]))" }
        | exp '/' exp             { "(($_[1]) / ($_[3]))" }
        | exp '.and.' exp         { "(($_[1]) && ($_[3]))" }
        | exp '.or.' exp          { "(($_[1]) || ($_[3]))" }
        | exp '.eq.' exp          { "(($_[1]) == ($_[3]))" }
        | exp '.ne.' exp          { "(($_[1]) != ($_[3]))" }
        | exp '.eqv.' exp         { "(($_[1]) == ($_[3]))" }
        | exp '.neqv.' exp        { "(($_[1]) != ($_[3]))" }
        | exp '.lt.' exp          { "(($_[1]) < ($_[3]))" }
        | exp '.gt.' exp          { "(($_[1]) > ($_[3]))" }
        | exp '.le.' exp          { "(($_[1]) <= ($_[3]))" }
        | exp '.ge.' exp          { "(($_[1]) >= ($_[3]))" }
        | '.not.' exp             { "(!($_[2]))" }
        | '(' exp ')'             { $_[2] }
        | 'NUM'                   { $_[1] }
        | 'NAME' '(' arglist ')'  { $_[1] . $_[3] }
        ;

      arglist : exp               { "[($_[1])-1]" }
        | arglist ',' exp         { "[($_[3])-1]$_[1]" }
        ;

      %%

      use warnings;
      use strict;

      sub lex
        {
        /\G\s+/gc, return
          /\G((\d+(\.\d*)?|\.\d+)([Ee][-+]?\d+)?)/gc ? (NUM => $1) :
          /\G(\.(?:and|or|eqv?|ne|neqv|lt|gt|le|ge|not)\.)/gc ? $1 :
          /\G(\w+)/gc ? (NAME => $1) :
          /\G(\w+|.)/gc ? $1 :
          ''
          for $_[0]->YYData->{in};
        }

      sub error
        {
        for ($_[0]->YYData->{in})
          {
          substr $_, pos(), 0, '<-- HERE ';
          die "parse ERROR: $_\n";
          }
        }

      my $code = '.not.foo(1,bar(2)+1,3).and.baz(4,5)';
      @ARGV and $code = "@ARGV";

      my $parser = new fortran2c;
      $parser->YYData->{in} = $code;
      my $answer = $parser->YYParse(yylex => \&lex, yyerror => \&error)
        or die "syntax error\n";
      print "$answer\n";

      __END__

      # Makefile

      all: fortran2c.pl

      %.pl: %.yp
              yapp -v -s -b '/usr/bin/perl' -o $@ $<
              chmod +x $@
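A note on the arglist rule in the grammar above: it emits the subscripts in reverse order and subtracts 1 from each, so NAME(i,j) comes out as NAME[(j)-1][(i)-1]. The post doesn't say why, but this is presumably meant to map Fortran's 1-based, column-major indexing onto C's 0-based, row-major arrays. A minimal C sketch of that assumption follows; the 2x3 shape, the fill values 10*i+j and the name `read_as_c` are made up for illustration.

```c
#include <string.h>

/* Hypothetical Fortran declaration: INTEGER A(2,3) */
enum { NROWS = 2, NCOLS = 3 };

/*
 * Lay out memory the way Fortran would (column-major, 1-based) with
 * A(i,j) = 10*i + j, then read it back through the subscripts the
 * grammar generates for A(i,j), namely a[(j)-1][(i)-1].
 */
int read_as_c(int i, int j)
{
    int flat[NROWS * NCOLS];
    int a[NCOLS][NROWS];   /* C declaration with the dimensions reversed */

    for (int jj = 1; jj <= NCOLS; jj++)
        for (int ii = 1; ii <= NROWS; ii++)
            flat[(ii - 1) + (jj - 1) * NROWS] = 10 * ii + jj;

    memcpy(a, flat, sizeof a);   /* same bytes, viewed as a C 2-D array */
    return a[(j)-1][(i)-1];
}
```

If the assumption holds, read_as_c(i, j) returns 10*i + j for every valid pair, which means the reversed subscripts address the same element Fortran would. Note that this only works when the C array is declared with its dimensions reversed as well.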

        Thanks for reminding me of operator precedence.

        It's an even more complicated matter than it first looks, for reasons that go beyond mere parsing: in C, operators like && and || short-circuit, while in Fortran their counterparts may or may not, depending on the implementation. Luckily this might not matter with the somewhat restricted expressions I have.
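The short-circuit point can be seen in C directly. Here is a minimal sketch; the names `rhs`, `calls_short_circuit` and `calls_eager` are invented for the illustration and come from nowhere in the thread. It counts how often the right-hand operand is evaluated under C's && and under an "evaluate both operands first" variant, which is one way a translator could mimic a Fortran implementation that does not short-circuit.

```c
#include <stdbool.h>

static int calls;   /* number of times the right-hand operand was evaluated */

static bool rhs(void)
{
    calls++;
    return true;
}

/* Plain C: rhs() is skipped whenever lhs is false. */
int calls_short_circuit(bool lhs)
{
    calls = 0;
    bool result = lhs && rhs();
    (void)result;
    return calls;
}

/* Eager variant: evaluate the right operand into a temporary first. */
int calls_eager(bool lhs)
{
    calls = 0;
    bool right = rhs();
    bool result = lhs && right;
    (void)result;
    return calls;
}
```

With a false left operand the first function reports zero evaluations and the second reports one. The difference only matters when the operands have side effects (function calls, mostly), which is why it may well be harmless for restricted expressions.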

        To give .and. higher precedence than .or., split

        %left '.and.' '.or.'

        into two lines

        %left '.or.'
        %left '.and.'

        (See how tricky parsing is :)

        Also, I'm using perl's precedences (which perlop says are the same as C's) instead of FORTRAN's, which I haven't bothered to look up :)

        If you do look them up and they differ, just remember "low precedence at top, higher precedence is lower" :) (yacc standard :)

        Also, I guessed that FORTRAN's .not. is a different precedence than C's !, it may be more like perl's "not" (or not :)

        (See how tricky parsing is :)

        Also note that I don't have to worry about C's precedences - the benefits of complete parenthesizing :)
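To make the .and./.or. fix concrete, here is a small C sketch of the two parenthesized shapes the generator would emit for an expression of the form a .or. b .and. c. The function names and the plain bool parameters are stand-ins for illustration; the grammar in the thread only accepts numbers and NAME(...) references as operands.

```c
#include <stdbool.h>

/*
 * Shape emitted when .and. and .or. share a single %left line:
 * equal precedence, left associative, so the .or. groups first.
 */
bool flat_grouping(bool a, bool b, bool c)
{
    return (((a) || (b)) && (c));
}

/*
 * Shape emitted after the split, where .and. is declared on a later
 * line and therefore binds tighter than .or.
 */
bool split_grouping(bool a, bool b, bool c)
{
    return ((a) || ((b) && (c)));
}
```

For a true, b false, c false the two shapes disagree: the flat grouping yields false and the split grouping yields true. Because every subexpression is wrapped in parentheses, C's own precedence never gets a say; the result depends only on how the yapp grammar grouped the operators.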