bear_hwn has asked for the wisdom of the Perl Monks concerning the following question:

(yes, me again.) in an attempt to speed up execution of an assembler (based on a c-like syntax and using the Parse::RecDescent package), i have split out and precompiled my grammar as per the documented instructions. as a result, i now have a tool that runs a bit faster, but it acts differently (erroneously flagging errors) in the pre-compiled version vs. interpreted -- not something that i would necessarily have expected based on what i've read. (it probably means i've been sloppy.) i have R'd all the FM's i can find, but i'd sure appreciate any pointers to additional documentation and/or expertise base. i've already had the suggestion that i try something besides P::RD but i'd rather not have to abandon what i've done. thanks.

Replies are listed 'Best First'.
Re: Parse::RecDescent precompiled grammar != interp'd
by BrowserUk (Patriarch) on Jun 02, 2003 at 23:11 UTC

    I think that you may get better answers to your Parse::RecDescent questions if you try the P::RD mailing list.

    Note: I stole the above reference from one of Merlyn's posts that I remembered seeing.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


      thanks very much for the pointer!
Re: Parse::RecDescent precompiled grammar != interp'd
by jepri (Parson) on Jun 02, 2003 at 23:01 UTC
    It might be worth checking if you use global variables or namespace variables in your grammar rules. Are you setting them up and handling them the same way you did before you compiled your grammar?
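A minimal sketch of why that question matters, in plain Perl with no Parse::RecDescent involved. An unqualified package variable resolves against whichever package the code was compiled in, while a fully qualified name such as $main::LastSuccess refers to the same variable from anywhere. The package name Some::Other::Namespace is hypothetical; it only stands in for wherever a parser's actions happen to be compiled.

```perl
use strict;
use warnings;

# Global owned by the main program, declared with "use vars"
# the way the assembler script declares its globals.
use vars qw($LastSuccess);
$LastSuccess = "set in main";

package Some::Other::Namespace;   # hypothetical stand-in package

# Fully qualified name: always the variable in package main.
sub qualified { return $main::LastSuccess }

# Unqualified name: resolves in the current package, so this reads
# $Some::Other::Namespace::LastSuccess, a different variable that
# was never assigned.
sub unqualified {
    no strict 'vars';
    no warnings 'once';
    return $LastSuccess;
}

package main;

print "qualified:   ", Some::Other::Namespace::qualified(), "\n";
print "unqualified: ",
    (defined Some::Other::Namespace::unqualified() ? "defined" : "undef"),
    "\n";
```

If any action in the grammar touches a global without the main:: prefix, it may be worth checking which package that name ends up in for each of the two setups.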

    How bad is it to live with compiling the grammar at the start of each run? I usually try to keep it in memory anyway, so the startup cost isn't too bad.

    And yes, we would need code, or at least the text of the error messages to help you further.

    ____________________
    Jeremy
    I didn't believe in evil until I dated it.

      yes, some globals are used, e.g. to construct the instruction words for the machine and keep track of address and lines, etc. these should not have any effect on the parsing, methinks. i've just made a reply, with some code and output samples, to revdiablo. as for startup cost, the beastie is _terribly_ slow. precompiling helps a bit, but what i'd really like to do is to get perlcc to be happy with it, but alas, that's a whole different can of worms: it has serious complaints and gives errors like "compile failed...that cannot happen!" :-/
Re: Parse::RecDescent precompiled grammar != interp'd
by revdiablo (Prior) on Jun 02, 2003 at 22:19 UTC

    I don't know much about Parse::RecDescent, but as a preemptive strike, I'm almost sure someone who does will ask for the code in question. Perhaps you should put together a snippet that demonstrates the problem and post it before someone else sees your question. ;)

      i thought about providing code, but figured it'd be considerably more than most would care to wade through. ok, you asked for it. :-) here's the synopsis: blank lines are erroneously flagged as errors in the precompiled version, and the resync seems to cause skipped lines -- note that the first instruction is skipped in the .pm case. (notes: (1) the grammar keeps track of its successes and prints that as hints in case of failure; (2) the hardware repeat is 1 more than the value coded into the instruction, so the assembler decrements the user-provided value.)
      
      correct output from fully interpreted version:
      
      
      1: // test cases.
      2:
      3: // repeat instruction
      4:
      [0000] 000000f70000
      5: repeat = 0x0001;
      6: repeat = 0x0002;
      [0001] 000000f70001
      7: repeat = 0xfade;
      [0002] 000000f7fadd

      incorrect output from same input when using precompiled grammar:

      1: // test cases.
      2:
      INVALID INSTRUCTION DETECTED.

      I got as far as finding ...i had just started!
       (going for anything i might recognise)

      my last success ended at LINE 1 COL 1.
      You might have a look at that
       last line, or at this line:

       3: // repeat instruction
      skip 1 address, try to continue...

      3: // repeat instruction
      4:
      5: repeat = 0x0001;
      6: repeat = 0x0002;
      [0001] 000000f70001
      7: repeat = 0xfade;
      [0002] 000000f7fadd

      at a high level, here is the bare bones of the code. first is the interpreted version, later is as changed for use with the precompiled grammar as a .pm file:

      #interp'd
      $mach_grammar = q {
          [...grammar_contents...]
      }

      main
      {
          my $mach_parser = new Parse::RecDescent ($mach_grammar);
          #[...]
          defined $mach_parser->program_body($src_code) or die "bad grammar!";
      }

      ok, that was the interpreted, now for the precompiled:

      #precompiled
      use mach_grammar;

      main
      {
          my $mach_parser = mach_grammar->new();
          #[...]
          defined $mach_parser->program_body($src_code) or die "bad grammar!";
      }

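For readers who have not done the precompile step, here is a hedged sketch of how a module like mach_grammar.pm can be produced with the documented Parse::RecDescent->Precompile class method. The toy grammar and the class name Toy_grammar are hypothetical stand-ins, not the grammar from this thread, and the module load is guarded so the sketch still runs where Parse::RecDescent is not installed.

```perl
use strict;
use warnings;

# Toy grammar (hypothetical); the real grammar is far larger.
my $grammar = q{
    program : hexval(s)
    hexval  : /0x[0-9A-F]+/i { $return = hex $item[1]; }
};

# Precompile writes Toy_grammar.pm into the current directory.
# The eval guard only protects against the module being absent.
my $have_prd = eval { require Parse::RecDescent; 1 };

if ($have_prd) {
    Parse::RecDescent->Precompile($grammar, "Toy_grammar");
    print "wrote Toy_grammar.pm\n";
}
else {
    print "Parse::RecDescent not installed; nothing written\n";
}
```

The generated module is then loaded with "use Toy_grammar;" and instantiated with Toy_grammar->new(), which mirrors the mach_grammar->new() call in the synopsis.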
      in more detail, below is substantially more of the code. the changes are, as noted, very small and a lot harder to see in it all. first is the interpreted version, later is as changed for use with the precompiled grammar as a .pm file. the debug option has been set in each case, so that internal test data is used and the input file (spec'd on the command line) is ignored.

      #interp'd
      require "/usr/lib/perl5/5.6.1/getopts.pl";
      #[...]
      our @prog_src;
      our $src_code;
      our $opt_i;
      use vars qw($opt_l);    # L is used in actions inside the grammar

      # the machine has 48 bit instruction words, which are built up
      # in two halves.
      use vars qw($Instruction_upper);
      use vars qw($Instruction_lower);
      use vars qw($Instruction_address);
      use vars qw($LastPrintedLine);
      use vars qw($LineToPrint);
      use vars qw($LastSuccess);
      use vars qw($LookingFor);
      #[...]

      $mach_grammar = q {

          program_body : instruction(s) program_body
                           { $return = 1; }
                       | instruction
                           { $return = 1; }
                       | end_of_input
                           {
                             print sprintf "\nend of source at LINE %d\n", $thisline;
                             exit;
                           }
                       | bad_instr :
                           {
                             print sprintf "\nINVALID INSTRUCTION DETECTED.";
                             print sprintf "\n\n%s %s\n (%s %s)\n",
                                 "I got as far as finding", $main::LastSuccess,
                                 "going for", $main::LookingFor;
                             $main::Instruction_address += 1;
                             print sprintf "\nmy last success ended at LINE %d COL %d.", $thisline, $thiscolumn;
                             print sprintf "\nYou might have a look at that";
                             print sprintf "\n last line, or at this line:";
                             print "\n\n ";
                             print sprintf "%d: %s", $main::LastPrintedLine + 2,
                                 @main::prog_src[$main::LastPrintedLine+1];
                             print sprintf "\nskip 1 address, try to continue...\n";
                           }
                           <resync:[^;]*[;]>

          #[....major snippage...]

          hexval : /0x[0-9A-F]+/i
                     { $return = hex $item[1]; 1; }

      }; # end of grammar

      #
      # main program follows
      #
      {
          my $mach_parser = new Parse::RecDescent ($mach_grammar);
          #[...]

          # acquire @prog_src
          if ($debug == 1) {
              @prog_src = <DATA> or die "ERROR: cannot read input data.";
          }
          else {
              # open the input file
              open(INFILE, "$opt_i") || die "ERROR: Cannot open $opt_i";
              @prog_src = <INFILE> or die "ERROR: cannot read input file.";
              close(INFILE);
          }

          #
          # mash all the source together and remove all the newlines.
          #
          foreach (0 .. scalar(@prog_src) - 1) { $prog_src[@_] =~ s/\\r/\$\//g ; }
          push @prog_src, "\n__EOF__";
          $src_code = join " ", @prog_src;

          defined $mach_parser->program_body($src_code) or die "bad grammar!";
      }

      #
      # ******************************************************************
      # main program section has ended, debug data section follows.
      # ******************************************************************
      #
      __DATA__
      // test cases.

      // repeat instruction

      repeat = 0x0001;
      repeat = 0x0002;
      repeat = 0xfade;
      //[...rest of test cases removed...]
      /* that's all folks.... */
      __EOF__

      and the precompiled usage has the grammar all off in another file, precompiled to mach_grammar.pm and the main code then looks like:

      require "/usr/lib/perl5/5.6.1/getopts.pl";
      #[...]
      use mach_grammar;
      our @prog_src;
      our $src_code;
      our $opt_i;
      use vars qw($opt_l);    # L is used in actions inside the grammar

      # the machine has 48 bit instruction words, which are built up
      # in two halves.
      use vars qw($Instruction_upper);
      use vars qw($Instruction_lower);
      use vars qw($Instruction_address);
      use vars qw($LastPrintedLine);
      use vars qw($LineToPrint);
      use vars qw($LastSuccess);
      use vars qw($LookingFor);

      #
      # main program follows
      #
      {
          my $mach_parser = mach_grammar->new();
          #[...]

          # acquire @prog_src
          if ($debug == 1) {
              @prog_src = <DATA> or die "ERROR: cannot read input data.";
          }
          else {
              # open the input file
              open(INFILE, "$opt_i") || die "ERROR: Cannot open $opt_i";
              @prog_src = <INFILE> or die "ERROR: cannot read input file.";
              close(INFILE);
          }

          #
          # mash all the source together and remove all the newlines.
          #
          foreach (0 .. scalar(@prog_src) - 1) { $prog_src[@_] =~ s/\\r/\$\//g ; }
          push @prog_src, "\n__EOF__";
          $src_code = join " ", @prog_src;

          defined $mach_parser->program_body($src_code) or die "bad grammar!";
      }

      #
      # ******************************************************************
      # main program section has ended, debug data section follows.
      # ******************************************************************
      #
      __DATA__
      // test cases.

      // repeat instruction

      repeat = 0x0001;
      repeat = 0x0002;
      repeat = 0xfade;
      //[...rest of test cases removed...]
      /* that's all folks.... */
      __EOF__

        the "skipping a line" that i noticed is probably due to the
        <resync:[^;]*[;]>
        which gets invoked upon failure. it will consume up through the next semicolon: usually the next instruction past the blank line just flagged as an error.
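The skipping effect can be reproduced outside the parser with the same pattern in a plain Perl regex. This is only a sketch under assumptions: the input is modeled on the thread's test data, joined with a space the way the script joins its lines, and the starting position (just after line 1) is a hypothetical place for the parser to have given up. It does not model anything else Parse::RecDescent does around the resync.

```perl
use strict;
use warnings;

# Test input modeled on the thread's __DATA__ section.
my $src = join " ",
    "// test cases.\n",
    "\n",
    "// repeat instruction\n",
    "\n",
    "repeat = 0x0001;\n",
    "repeat = 0x0002;\n";

# Hypothetical failure point: right after the first line.
pos($src) = length("// test cases.\n");

# Same pattern as in <resync:[^;]*[;]>, anchored at the current position.
# Scalar context, so exactly one match is consumed.
my $skipped = '';
if ($src =~ /\G([^;]*[;])/gc) {
    $skipped = $1;
}

# Whatever is left is what the parser would see next.
my $remainder = substr($src, pos($src));

print "skipped:   [$skipped]\n";
print "remainder: [$remainder]\n";
```

Because neither the blank lines nor the comment contain a semicolon, the match runs on through the end of the first repeat instruction, and parsing would resume at the second one.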