If the duration of execution is related to the building or execution of the grammar by Parse::RecDescent, you may want to take a look at some of the notes in the Parse::RecDescent documentation relating to pre-compiling parsers. This approach is likely to be the least-painful method by which to improve your grammar performance. From the Parse::RecDescent documentation ...
Normally Parse::RecDescent builds a parser from a grammar at run-time. That approach simplifies the design and implementation of parsing code, but has the disadvantage that it slows the parsing process down - you have to wait for Parse::RecDescent to build the parser every time the program runs. Long or complex grammars can be particularly slow to build, leading to unacceptable delays at start-up.
To overcome this, the module provides a way of "pre-building" a parser object and saving it in a separate module. That module can then be used to create clones of the original parser.
A grammar may be precompiled using the Precompile class method. For example, to precompile a grammar stored in the scalar $grammar, and produce a class named PreGrammar in a module file named PreGrammar.pm, you could use:
use Parse::RecDescent;
Parse::RecDescent->Precompile($grammar, "PreGrammar");
The first argument is the grammar string, the second is the name of the class to be built. The name of the module file is generated automatically by appending ".pm" to the last element of the class name. Thus
Parse::RecDescent->Precompile($grammar, "My::New::Parser");
would produce a module file named Parser.pm.
It is somewhat tedious to have to write a small Perl program just to generate a precompiled grammar class, so Parse::RecDescent has some special magic that allows you to do the job directly from the command-line.
If your grammar is specified in a file named grammar, you can generate a class named Yet::Another::Grammar like so:
> perl -MParse::RecDescent - grammar Yet::Another::Grammar
This would produce a file named Grammar.pm containing the full definition of a class called Yet::Another::Grammar. Of course, to use that class, you would need to put the Grammar.pm file in a directory named Yet/Another, somewhere in your Perl include path.
Having created the new class, it's very easy to use it to build a parser. You simply use the new module, and then call its new method to create a parser object. For example:
use Yet::Another::Grammar;
my $parser = Yet::Another::Grammar->new();
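The precompile step quoted above can be wrapped in a small build helper so the module is regenerated only when the grammar changes. This is a minimal sketch, not part of Parse::RecDescent: the helper name precompile_if_stale and the staleness check are assumptions for illustration, and the file name grammar and class Yet::Another::Grammar are simply reused from the documentation example. Only the Precompile call itself comes from the module.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical helper: rebuild the precompiled module only when it is
# missing or older than the grammar file.
# Returns 1 if a rebuild happened, 0 otherwise.
sub precompile_if_stale {
    my ($grammar_file, $class) = @_;

    # Precompile names its output after the last element of the class name.
    (my $module_file = $class) =~ s/.*:://;
    $module_file .= '.pm';

    # -M is the file's age in days relative to script start; smaller is newer.
    return 0 if -e $module_file && -M $module_file <= -M $grammar_file;

    # Slurp the whole grammar file into one string.
    open my $fh, '<', $grammar_file or die "Cannot read $grammar_file: $!";
    my $grammar = do { local $/; <$fh> };
    close $fh;

    # Writes the module file into the current directory.
    Parse::RecDescent->Precompile($grammar, $class);
    return 1;
}

# Only attempt the build if a file called 'grammar' is present.
precompile_if_stale('grammar', 'Yet::Another::Grammar') if -e 'grammar';
```

The generated Grammar.pm still has to be moved into a Yet/Another directory on the include path before it can be loaded, as the documentation notes.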
Furthermore, the Parse::RecDescent::FAQ documentation module includes a number of methods by which grammars can be optimised, often with great resulting improvements in grammar performance. Most notably:
- Reduce the "depth" of the grammar. Use fewer levels of nested subrules.
- Where possible, use regex terminals instead of subrules.
- Where possible, use string terminals instead of regexes.
- Use repetitions or <leftop>/<rightop> instead of recursion.
- Factor out common prefixes in a set of alternate productions.
- Pre-parse the input somehow to break it into smaller sections. Parse each section separately.
- Precompile the grammar. This won't speed up the parsing, but it will speed up the parser construction for big grammars (which Really Big Files often require).
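To make a few of these tips concrete, here is a small sketch that writes the same comma-separated list grammar twice. Both grammars are toy examples invented for illustration and are not taken from the FAQ. The first uses recursion, a shared prefix across two productions and an extra subrule level; the second uses <leftop> and a single regex terminal.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Deeper shape: recursion, two productions sharing the prefix "item",
# and an extra subrule level (item -> word) for every element.
my $slow_grammar = q{
    list : item ',' list { $return = [ $item[1], @{ $item[3] } ] }
         | item          { $return = [ $item[1] ] }
    item : word
    word : /\w+/
};

# Flatter shape: <leftop> replaces the recursion and the shared prefix,
# and item is a regex terminal rather than a subrule.
my $fast_grammar = q{
    list : <leftop: item ',' item>
    item : /\w+/
};

my $slow_parser = Parse::RecDescent->new($slow_grammar)
    or die "Bad recursive grammar\n";
my $fast_parser = Parse::RecDescent->new($fast_grammar)
    or die "Bad leftop grammar\n";

my $input = 'alpha, beta, gamma';
my $slow_result = $slow_parser->list($input);
my $fast_result = $fast_parser->list($input);

# Each result is an array reference of the matched items.
print "recursive: @$slow_result\n";
print "leftop:    @$fast_result\n";
```

How much the flatter version helps depends on the grammar and the input, so it is worth timing both against real data, for example with the core Benchmark module.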
perl -le 'print+unpack"N",pack"B32","00000000000000000100000010111010"'
thanks very much for the notes on precompilation.
i will give that a try as my first step.
s/^$syntax_I_recognize// && do { &Action_for_syntax };
And it does this for everything, over and over again. This "nibbling" approach
is great for short strings, but for long strings, there's
a whole lot of string copying going on.
theDamian said that a significant speedup would have been realized by changing these to:
/\G$syntax_I_recognize/gc && do { &Action_for_syntax };
and letting pos() walk through the string, but alas never got the time to change P::RD into Parse::FastDescent (this would have been the basic change).
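The difference between the two styles can be seen in a tiny standalone tokenizer. This is a sketch in plain Perl, not code from Parse::RecDescent: the token table and both function names are made up for illustration. The first function nibbles the front of the string with s/^...//, the second leaves the string untouched and lets /\G.../gc advance pos().

```perl
use strict;
use warnings;

# Hypothetical token patterns, tried in order.
my @token_types = (
    [ NUMBER => qr/\d+/          ],
    [ IDENT  => qr/[A-Za-z_]\w*/ ],
    [ OP     => qr/[-+*\/=]/     ],
);

# "Nibbling": every match removes the token from the front of the string.
sub tokenize_nibble {
    my ($text) = @_;
    my @tokens;
    TOKEN: while (length $text) {
        next TOKEN if $text =~ s/^\s+//;          # drop leading whitespace
        for my $type (@token_types) {
            my ($name, $re) = @$type;
            if ($text =~ s/^($re)//) {
                push @tokens, [ $name, $1 ];
                next TOKEN;
            }
        }
        die "Unrecognised input at: " . substr($text, 0, 20) . "\n";
    }
    return \@tokens;
}

# pos() walking: the string is never modified; \G anchors each match at
# the current position and /c keeps that position when a match fails.
sub tokenize_pos {
    my ($text) = @_;
    my @tokens;
    pos($text) = 0;
    TOKEN: while (pos($text) < length $text) {
        next TOKEN if $text =~ /\G\s+/gc;         # skip whitespace
        for my $type (@token_types) {
            my ($name, $re) = @$type;
            if ($text =~ /\G($re)/gc) {
                push @tokens, [ $name, $1 ];
                next TOKEN;
            }
        }
        die "Unrecognised input at offset " . pos($text) . "\n";
    }
    return \@tokens;
}

# Both functions should produce the same token stream.
for my $tokenizer (\&tokenize_nibble, \&tokenize_pos) {
    my $tokens = $tokenizer->('x = 42 + y1');
    print join(' ', map { "$_->[0]($_->[1])" } @$tokens), "\n";
}
```

Timing both functions on a long input with the core Benchmark module is the simplest way to check how much the second style gains on a given perl.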
-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
it's not a classic assembler syntax and my language is not
line based: the machine i am programming has several
compute units and multiple address units going
simultaneously, so i adopted a c-like syntax where
whitespace doesn't count. (5-8 lines of source code per
instruction are not uncommon.) hopefully, the pre-compile
approach will buy me decent speed up. many thanks to all!
the pre-compilation of the grammar helps quite a bit: the
startup time is much shorter, though still long, and i can
now see the processing time per instruction. with my grammar
separated for precompilation, i decided to try perlcc again
on my little bit of user interface and the inclusion of the
precompiled grammar module - the result was the same sort of "it
failed to compile, but that is not possible!" error message.
does anybody know of a documented case where Parse::RecDescent
has actually been compiled with perlcc, or am i chasing a
will-o'-the-wisp? thanks in advance.
firstly, try the above Parse::RecDescent optimizations. if those still do not help, you might want to use another recursive descent parser, one as flexible and easy to use as Parse::RecDescent, just faster.
for me the C/C++ version of ANTLR, by Terence Parr, is the best example of this. instead of wrestling with perlcc, take it for a spin. it's still available, fast, well documented and easy to build.
...wufnik
-- in the world of the mules there are no rules --
i'll bear in mind the package you suggested. first, however,
is the pre-compile attempt. many thanks!