comment on

Sorry for removing the Tk code. It was easier for me to test it this way.

I wrote this program because I needed to figure out how to give relevant error messages using Parse::RecDescent. Don't count on us doing this much work for you in the future.

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper      ();
use Parse::RecDescent ();

our $test_num;

sub do_test {
   my $parser = $_[0];
   our $text; local *text = \$_[1];  # Pass by ref.

   ++$test_num;

   print("========================\n");
   print("\n");

   print("Test $test_num\n");
   print("======\n");
   print("\n");

   print("TEXT\n");
   print("----\n");
   print($text);
   print("\n");

   print("OUTPUT\n");
   print("------\n");
   my $data = $parser->parse($text);

   if ($data) {
      print(Data::Dumper::Dumper($data));
   } else {
      warn("Bad text.\n");
   }

   print("\n");
}


my $grammar = <<'__EOI__';

   ########################################
   # Helper Code
   #

   {
      sub OK     () { ''    }
      sub REJECT () { undef }

      my %COMMANDS = map { $_ => $_ } qw(
         print
      );
      
      # XXX Global bad.
      # Should be an attribute of the parser.
      my $in_code;

      sub dequote {
         local $_ = @_ ? $_[0] : $_;
         s/^['"]//;
         s/['"]$//;
         s/\\(.)/$1/gs;
         return $_;
      }
   }


   ########################################
   # Rules
   #

   parse      : SET_IN_JUNK parse_ { $item[2] }

   parse_     :         stmt_list EOF    { $item[1] }
              | IN_JUNK JUNK_LINE parse_ { $item[3] }

   stmt_list  : stmt SET_IN_CODE sep stmt_list_(s?)
                { [ $item[1], @{$item[4]} ] }

   stmt_list_ : stmt sep { $item[1] }

   stmt       : cmd_name arg_list { [ $item[1], @{$item[2]} ] }

   cmd_name   : IDENT { $COMMANDS{$item[1]} }
              | /[^ \t\n;]{1,20}/
                {
                   my $idx  = 1;
                   my $line = $itempos[$idx]{line}{from};
                   my $col  = $itempos[$idx]{column}{from};
                   my $cmd  = $item[$idx];
                   warn("$line,$col: Unknown command \"$cmd\".\n")
                      if ($in_code);
                   REJECT
                }

   arg_list   : QUOTED_STR { [ $item[1] ] }
              | {
                   my $idx  = 1;
                   my $line = $itempos[$idx]{line}{from};
                   my $col  = $itempos[$idx]{column}{from};
                   warn("$line,$col: Bad argument list.\n");
                   REJECT
                }

   sep        : SEP
              | {
                   my $idx  = 1;
                   my $line = $itempos[$idx]{line}{from};
                   my $col  = $itempos[$idx]{column}{from};
                   warn("$line,$col: Expecting ';'.\n");
                   REJECT
                }


   ########################################
   # Tokens
   #

   EOF        : /\Z/
   JUNK_LINE  : /[^\n]*\n/
   SEP        : ';'
   IDENT      : /[a-zA-Z][a-zA-Z0-9-_]*/
   QUOTED_STR : /'(?:[^'\\]|\\.)*'/ { dequote($item[1]) }
              | /"(?:[^"\\]|\\.)*"/ { dequote($item[1]) }


   ########################################
   # Special
   #

   SET_IN_JUNK : { $in_code = 0; OK }
   SET_IN_CODE : { $in_code = 1; OK }
   IN_JUNK     : { $in_code ? REJECT : OK }

__EOI__


$::RD_ERRORS = 1;
$::RD_WARN   = 1;
$::RD_HINT   = 1;
$::RD_TRACE  = undef;

# Make sure STDOUT and STDERR mix correctly.
select(STDOUT); $|=1;
select(STDERR); $|=1;
select(STDOUT);

my $parser = Parse::RecDescent->new($grammar)
or die("Bad Grammar.\n");

do_test($parser, <<'__EOI__');
junk junk junk
junk junk print junk junk
junk junk junk
print 'foo';
print 'bar';
__EOI__

do_test($parser, <<'__EOI__');
junk junk junk
junk junk print junk junk
junk junk junk
print 'boo';
junk junk
__EOI__

do_test($parser, <<'__EOI__');
junk junk junk
junk junk print junk junk
junk junk junk
print '\'moo\'';
__EOI__

do_test($parser, <<'__EOI__');
junk junk junk
junk junk print junk junk
junk junk junk
print 'Hello';
print;
__EOI__

print("========================\n");
[download]

Notes:

I wrote it so more commands could be added easily.
I left argument checking out of the parser to simplify extending the language.
It allows more than one command per line, as long as they are seperated by semi-colons.
It gives very useful error messages.

However,

The semi-colon is required.
Commands ('print') are case-sensitive.
Changing $COMMANDS{$item[1]} to $COMMANDS{lc($item[1])} makes it case-insensitive.

Output:

========================

Test 1
======

TEXT
----
junk junk junk
junk junk print junk junk
junk junk junk
print 'foo';
print 'bar';

OUTPUT
------
$VAR1 = [
          [
            'print',
            'foo'
          ],
          [
            'print',
            'bar'
          ]
        ];

========================

Test 2
======

TEXT
----
junk junk junk
junk junk print junk junk
junk junk junk
print 'boo';
junk junk

OUTPUT
------
5,1: Unknown command "junk".
Bad text.

========================

Test 3
======

TEXT

----
junk junk junk
junk junk print junk junk
junk junk junk
print '\'moo\'';

OUTPUT
------
$VAR1 = [
          [
            'print',
            '\'moo\''
          ]
        ];

========================

Test 4
======

TEXT
----
junk junk junk
junk junk print junk junk
junk junk junk
print 'Hello';
print;

OUTPUT
------
5,6: Bad argument list.
Bad text.

========================
[download]

To make semi-colons optional:

For personal reference.
Not fully tested.

Change:
   stmt_list  : SEP(s?) stmt SET_IN_CODE stmt_list_(s?) SEP(s?)
                { [ $item[2], @{$item[4]} ] }

Change:
   stmt_list_ : seps stmt { $item[2] }

Add:
   seps       : SEP(s)
              | {
                   my $idx  = 1;
                   my $line = $itempos[$idx]{line}{from};
                   my $col  = $itempos[$idx]{column}{from};
                   warn("$line,$col: Expecting ';'.\n");
                   REJECT
                }

Remove:
    sep
[download]

In reply to Re: Parse::Recdescent rule problem by ikegami
in thread Parse::Recdescent rule problem by hak0983

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.