Parse::RecDescent for simple syntax-directed translation

tomazos has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to work out how to use Parse::RecDescent and am a little overwhelmed.

I can see how to use it as a recognizer, but to do a simple syntax-directed translation, all of the *items, $return, $text, real return value, etc, stuff has thrown me off.

Let's say you want a translator for languages over ('a','b','c','d','e') such that any balanced pairs of equal length sequences of a's and b's are replaced by c's and d's.

my $start = "eeeeaaaabbbeeee";
my $end = translate($start);
# $end eq "eeeeacccdddeeee"  (aaabbb -> cccddd)

(Side note: This is the "classic example" of something that can't be done with regular grammars, and needs a context-free grammar. This is because within the regexp /(a*)(b*)/ there is no way to assert that length($1) == length($2). Update: changed "regular expressions" to "regular grammars".)

The syntax-directed translation (in pseudo-code) would be:

start -> part(s)    { start.t := join ('', part(s).t) }

part -> AnB         { part.t := AnB.t }
part -> 'a'         { part.t := 'a' }
part -> /[^a]+/     { part.t := /[^a]+/.t }

AnB -> 'a' AnB 'b'  { AnB.t := 'c' . AnB.t . 'd' }
AnB -> 'ab'         { AnB.t := 'cd' }
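
For comparison, the translation scheme above can also be sketched by hand in plain Perl, with no parsing module at all. This is only an illustration of the attribute rules, not Parse::RecDescent code; the names parse_AnB and translate are made up for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: try to match AnB starting at offset $pos of $str.
# Returns (translation, next offset) on success, an empty list on failure.
#   AnB -> 'a' AnB 'b'   { 'c' . AnB.t . 'd' }
#   AnB -> 'ab'          { 'cd' }
sub parse_AnB {
    my ($str, $pos) = @_;
    return unless substr($str, $pos, 1) eq 'a';

    # First alternative: 'a' AnB 'b'
    if (my ($inner, $next) = parse_AnB($str, $pos + 1)) {
        return ('c' . $inner . 'd', $next + 1)
            if substr($str, $next, 1) eq 'b';
    }

    # Second alternative: 'ab'
    return ('cd', $pos + 2) if substr($str, $pos, 2) eq 'ab';
    return;
}

# start -> part(s): concatenate the translation of each part.
sub translate {
    my ($str) = @_;
    my ($out, $pos) = ('', 0);
    while ($pos < length $str) {
        if (my ($t, $next) = parse_AnB($str, $pos)) {    # part -> AnB
            $out .= $t;
            $pos  = $next;
        }
        elsif (substr($str, $pos, 1) eq 'a') {           # part -> 'a'
            $out .= 'a';
            $pos++;
        }
        else {                                           # part -> /[^a]+/
            my ($run) = substr($str, $pos) =~ /^([^a]+)/;
            $out .= $run;
            $pos += length $run;
        }
    }
    return $out;
}

print translate("eeeeaaaabbbeeee"), "\n";    # prints eeeeacccdddeeee
```

Each alternative is tried in the order the rules are listed, so an unmatched leading 'a' falls through to the second part rule and is copied unchanged.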

Any ideas on how this translates into Parse::RecDescent? Is there a more appropriate parsing module to use for cases where the input language is very similar to the output language?

-Andrew.

Re: Parse::RecDescent for simple syntax-directed translation by Limbic~Region (Chancellor) on Jun 18, 2006 at 15:16 UTC
tomazos,

(This is the "classic example" of something that can't be done with regular expressions, and needs a context-free grammar. This is because within the regexp /(a*)(b*)/ there is no way to assert that length($1) == length($2). Anyhoo...)

While this is probably strictly true, Perl is all about letting you get the job done:

my $str = "eeeeaaaabbbeeee";
$str =~ s/((a+)(??{'b'x length$2}))/'c' x (length($1) * .5) . 'd' x (length($1) * .5)/e;

Anyhoo...I have just recently started learning Parse::RecDescent.

Update: I am not sure if this is what you had in mind, but the following accomplishes what you want without using sneaky experimental regex features.

#!/usr/bin/perl
use strict;
use warnings;
use Parse::RecDescent;

$Parse::RecDescent::skip = '';

my $grammar = q{
    match : PREFIX TOKEN SUFFIX {print join '', @item[1..3]}
    PREFIX : /.*?(?=a+b+)/
    TOKEN : /a+b+/
    {
        my $str = $item[1];
        my $a_cnt = $str =~ tr/a//;
        my $b_cnt = $str =~ tr/b//;
        if ($a_cnt == $b_cnt) {
            $return = ('c' x $a_cnt) . ('d' x $b_cnt);
        }
        elsif ($a_cnt > $b_cnt) {
            $return = ('a' x ($a_cnt - $b_cnt)) . ('c' x $b_cnt) . ('d' x $b_cnt);
        }
        else {
            $return = ('c' x $a_cnt) . ('d' x $a_cnt) . ('b' x ($b_cnt - $a_cnt));
        }
    }
    SUFFIX : /.*$/
};

my $parser = Parse::RecDescent->new($grammar);
$parser->match('sing aaaaaabbb song');

A lot of this code could be simplified and improved. I am neither a regex nor a Parse::RecDescent guru, but I did show how either could work. Once you have a string of a's followed by one or more b's ($item[1]), you only need to calculate the desired string and assign it to $return. An explicit assignment to $return is not necessary, as you could just let the last expression be returned, as with Perl's subroutines.

Cheers - L~R
Re^2: Parse::RecDescent for simple syntax-directed translation by tomazos (Deacon) on Jun 18, 2006 at 18:17 UTC
Thanks for answering my question. The regex solution is cool.

I should have qualified my statement by saying that standard regular grammars cannot handle this sort of pattern, whereas Perl's regexes can do everything and anything. In fact, embedded actions within a Perl regex can do anything Perl can do. Therefore Perl regexes can do anything Perl can do. :)

I guess I can see from your use of Parse::RecDescent (Update: and from re-reading the very long manual last night) that the answer to my question is something like:

my $grammar = q{
    match : part(s) { print join('', @{$item[1]}) }

    part : AnB
    part : 'a'
    part : /[^a]+/

    AnB : 'a' AnB 'b' { 'c' . $item[2] . 'd' }
    AnB : 'ab' { 'cd' }
};

-Andrew.
Re^3: Parse::RecDescent for simple syntax-directed translation by ikegami (Patriarch) on Jun 20, 2006 at 23:52 UTC
Close. Your grammar strips out all whitespace, because Parse::RecDescent skips whitespace before each terminal by default. Replace

match : part(s) { print join('', @{$item[1]}) }

with

match : <skip:''> part(s) { print join('', @{$item[2]}) }

A slight improvement is to replace

match : <skip:''> part(s) { print join('', @{$item[2]}) }

with

process : <skip:''> part(s) { join('', @{$item[2]}) }

so you can do

$filter = Parse::RecDescent->new($grammar);
print $filter->process('eeeeaaaabbbeeee');
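
Putting the pieces of this thread together, a complete script might look like the sketch below. It assumes Parse::RecDescent is installed from CPAN. The wrapper sub translate is a hypothetical convenience name, not part of the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    # <skip:''> turns off whitespace skipping; it counts as item 1,
    # so the list of parts is item 2.
    process : <skip:''> part(s) { join('', @{$item[2]}) }

    # With no action, a production returns the value of its last item.
    part : AnB
    part : 'a'
    part : /[^a]+/

    # The value of the last expression in an action becomes the rule's value.
    AnB : 'a' AnB 'b' { 'c' . $item[2] . 'd' }
    AnB : 'ab' { 'cd' }
};

my $filter = Parse::RecDescent->new($grammar)
    or die "Bad grammar\n";

# Hypothetical wrapper so the start rule can be called like a plain function.
sub translate {
    my ($text) = @_;
    return $filter->process($text);
}

print translate('eeeeaaaabbbeeee'), "\n";
```

Returning the joined string from the start rule, rather than printing inside the grammar, leaves it to the caller to decide what to do with the result.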
