•Re: Solution: Parse::RecDescent and mini-language parsing

Replies are listed 'Best First'.
Re: •Re: Solution: Parse::RecDescent and mini-language parsing by Flame (Deacon) on Apr 05, 2003 at 02:23 UTC
Ok, I ran the benchmark, and it seems the regex form is faster. I have tested it several times with the following code and results:

    use Parse::RecDescent;
    use Date::Calc qw(:all);
    use Benchmark ':all';
    use strict;
    use warnings;

    my $grammar1 = q~
    logic: expression eod { $return = $item[1]; }

    expression: <leftop: term termop term>

    termop: /and/i | /xor/i | /or/i

    term: '(' <commit> expression ')'
            { $return = $item[3]; } #[@item[1,3,4]]; } # Only include elements important to later processing
        | condition

    condition: element comparison element
            { $return = main::process(@item[1..3]); }

    element: '<' <commit> /-?\w+/ '>'
            { $return = "<$item[3]>"; } #Return this so that the condition value can be set
        | /\d+/ # num is automatically returned

    comparison: /(=[><]=)/ <commit> <error: Unable to match comparison, $1>
        | /=?[><]=?/
        | '='
        | '!='

    eod: /^\Z/
    ~;

    my $grammar2 = q~
    logic: expression eod { $return = $item[1]; }

    expression: <leftop: term termop term>

    termop: /and/i | /xor/i | /or/i

    term: '(' <commit> expression ')'
            { $return = $item[3]; } #[@item[1,3,4]]; } # Only include elements important to later processing
        | condition

    condition: element comparison element
            { $return = main::process(@item[1..3]); }

    element: '<' <commit> /-?\w+/ '>'
            { $return = "<$item[3]>"; } #Return this so that the condition value can be set
        | /\d+/ # num is automatically returned

    comparison: '<=' | '<' | '=' | '>=' | '>' | '!='

    eod: /^\Z/
    ~;

    my $parser1 = new Parse::RecDescent($grammar1) or die;
    my $parser2 = new Parse::RecDescent($grammar2) or die;

    my $test = '<DAY> = 4 or <DAY> > 4 or <DAY> < 4 or <DAY> >= 4 or <DAY> <= 4 or <DAY> != 4';

    cmpthese(10000,{
        'regex' => sub { $parser1->logic($test); },
        'quote' => sub { $parser2->logic($test); },
    });

Yielded:

           Rate quote regex
    quote 116/s    --   -5%
    regex 123/s    6%    --

This one added the =>= which I wanted to avoid, and while it slowed down both slightly, the regex was still in the lead.

    use Parse::RecDescent;
    use Date::Calc qw(:all);
    use Benchmark ':all';
    use strict;
    use warnings;

    my $grammar1 = q~
    logic: expression eod { $return = $item[1]; }

    expression: <leftop: term termop term>

    termop: /and/i | /xor/i | /or/i

    term: '(' <commit> expression ')'
            { $return = $item[3]; } #[@item[1,3,4]]; } # Only include elements important to later processing
        | condition

    condition: element comparison element
            { $return = main::process(@item[1..3]); }

    element: '<' <commit> /-?\w+/ '>'
            { $return = "<$item[3]>"; } #Return this so that the condition value can be set
        | /\d+/ # num is automatically returned

    comparison: /(=[><]=)/ <commit> <error: Unable to match comparison, $1>
        | /=?[><]=?/
        | '='
        | '!='

    eod: /^\Z/
    ~;

    my $grammar2 = q~
    logic: expression eod { $return = $item[1]; }

    expression: <leftop: term termop term>

    termop: /and/i | /xor/i | /or/i

    term: '(' <commit> expression ')'
            { $return = $item[3]; } #[@item[1,3,4]]; } # Only include elements important to later processing
        | condition

    condition: element comparison element
            { $return = main::process(@item[1..3]); }

    element: '<' <commit> /-?\w+/ '>'
            { $return = "<$item[3]>"; } #Return this so that the condition value can be set
        | /\d+/ # num is automatically returned

    comparison: '<=' | '<' | '=' | '>=' | '>' | '!='

    eod: /^\Z/
    ~;

    my $parser1 = new Parse::RecDescent($grammar1) or die;
    my $parser2 = new Parse::RecDescent($grammar2) or die;

    my $test = '<DAY> = 4 or <DAY> > 4 or <DAY> < 4 or <DAY> >= 4 or <DAY> <= 4 or <DAY> != 4 or <DAY> =<= 4';

    cmpthese(10000,{
        'regex' => sub { $parser1->logic($test); },
        'quote' => sub { $parser2->logic($test); },
    });

Yielded:

           Rate quote regex
    quote 104/s    --   -7%
    regex 113/s    8%    --

I admit, neither is as fast as I would like, but it certainly appears that the regex is the fastest method there, unless I made a mistake.

Edit: If you feel this was biased in any way, feel free to suggest another string, or another test altogether.
I don't use the Benchmark module frequently, and I may have inadvertently allowed for some bias.

My code doesn't have bugs, it just develops random features.
Flame ~ Lead Programmer: GMS (DOWN) | GMS (DOWN)
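One way to reduce the risk of bias mentioned above is to confirm that both candidates give the same answers before timing them, and to let Benchmark choose the iteration count. The following is only a sketch under assumptions: `match_regex`, `match_literals`, `@ops`, and `@literals` are hypothetical names that do not appear in the post, and the code times operator matching in isolation, so it says nothing about the overhead Parse::RecDescent itself adds.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input: operator strings only, with no parser around them.
my @ops = ('<=', '<', '=', '>=', '>', '!=');

# Strategy 1: a single regex anchored at the start of the string.
sub match_regex {
    my ($s) = @_;
    return $s =~ /^(=?[><]=?|!=|=)/ ? $1 : undef;
}

# Strategy 2: try literal strings one at a time, longest first.
my @literals = ('<=', '>=', '!=', '<', '>', '=');
sub match_literals {
    my ($s) = @_;
    for my $lit (@literals) {
        return $lit if index($s, $lit) == 0;
    }
    return undef;
}

# Sanity check before timing: both strategies must agree on every sample.
for my $op (@ops) {
    die "Strategies disagree on '$op'"
        unless match_regex($op) eq match_literals($op);
}

# A negative count tells Benchmark to run each sub for at least that many
# CPU seconds, instead of a fixed number of iterations.
cmpthese(-1, {
    regex    => sub { match_regex($_)    for @ops },
    literals => sub { match_literals($_) for @ops },
});
```

Differences of only a few percent between two rows can vary from run to run, so it is worth repeating the comparison several times before drawing a conclusion.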
•Re: Re: •Re: Solution: Parse::RecDescent and mini-language parsing by merlyn (Sage) on Apr 05, 2003 at 04:17 UTC
OK, so alternations are not as fast as I'd like {grin}. Try this... it means the same thing, but in one regex:

`comparison: / <=? | = | >=? | != /x`

It's important to understand that a regex is matched left-to-right for alternatives, so you have precise control over the possible matches.

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
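The left-to-right rule can be seen in a small standalone sketch. It uses plain Perl regexes outside any grammar; `first_match` and the three pattern variables are hypothetical names introduced here for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text a pattern matches at the start of
# $input, or undef if it does not match there.
sub first_match {
    my ($regex, $input) = @_;
    return $input =~ /^($regex)/ ? $1 : undef;
}

# Alternatives are tried in the order written, so a shorter alternative
# listed first wins even when a longer one would also match.
my $short_first = qr/ < | <= /x;
my $long_first  = qr/ <= | < /x;

# The single-regex form from the post: the optional '=' makes the
# ordering question disappear for '<' and '>'.
my $combined = qr/ <=? | = | >=? | != /x;

for my $op ('<=', '<', '>=', '=', '!=') {
    printf "%-2s  short-first=%-4s  long-first=%-4s  combined=%s\n",
        $op,
        first_match($short_first, $op) // 'none',
        first_match($long_first,  $op) // 'none',
        first_match($combined,    $op) // 'none';
}
```

For the input `<=`, the short-first pattern stops after `<`, while the long-first and combined patterns both consume `<=`.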
Re: •Re: Solution: Parse::RecDescent and mini-language parsing by Flame (Deacon) on Apr 05, 2003 at 01:21 UTC
Valid points both; perhaps I should do a benchmark and see which of our methods is faster. It was my goal to decrease the number of alternatives ("or"s) in the production, but perhaps regexes aren't the way to go.

As for the nature of the first line, I wanted to be sure that no one ever wrote =>=, to simplify later processing. Looking at it now, I don't think it would make much of a difference, but it does allow me to specify what the error was.

As for the comments, this is part of a school project, and while the comments do need to be cleaned up, they will help me to explain in a hurry the purpose of each element to the person reviewing it.

I'll look into those benchmarks and, if I remember, post them here soon... too much to do, too little time...
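The idea behind that first line, checking for the malformed operator before the permissive pattern gets a chance to accept it, can be sketched in plain Perl. This is an illustration under assumptions, not the grammar's actual mechanism: `read_comparison` is a hypothetical helper, and it only imitates what the `<commit>` and `<error: ...>` directives do inside Parse::RecDescent.

```perl
use strict;
use warnings;

# Hypothetical helper: read one comparison operator from the start of $text.
# Returns (operator, undef) on success or (undef, message) on failure.
sub read_comparison {
    my ($text) = @_;

    # Test the malformed three-character forms first. Without this step the
    # permissive pattern below would accept '=>=' as a valid operator.
    if ($text =~ /^(=[><]=)/) {
        return (undef, "Unable to match comparison, $1");
    }

    # Otherwise accept the usual operators.
    if ($text =~ /^(=?[><]=?|!=|=)/) {
        return ($1, undef);
    }

    return (undef, "No comparison operator found");
}

for my $input ('>= 4', '= 4', '=>= 4') {
    my ($op, $err) = read_comparison($input);
    print defined $op ? "'$input' -> operator $op\n"
                      : "'$input' -> error: $err\n";
}
```

The order of the two checks is the whole point: swapping them would let `=>=` through silently, which is the case the specific error message is meant to catch.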