
It's much easier to parse a structure if every block has an explicit end marker (END PAGE, END QUESTION, END CHOICES) as well as a start marker. You could do something like the following:

use strict;
use warnings;
use Data::Dumper;

my $structure = main(join '',<DATA>);
print Dumper($structure);

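# Split the input into pages; unnamed pages are numbered 1, 2, ...
sub main {
    my ($s, $c, %hash) = $_[0];
    while ($s =~ /START PAGE(?: (\w+))?\s+(.*?)\s+END PAGE/gs) {
        # defined() rather than a truth test, so a page named "0" keeps its name
        $hash{defined($1) ? $1 : ++$c} = page($2);
    }
    return \%hash;
}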

# Split one page body into questions, keyed by name or by running number
sub page {
    my ($s, $c, %hash) = $_[0];
    while ($s =~ /START QUESTION(?: (\w+))?\s+(.*?)\s+END QUESTION/gs) {
        $hash{defined($1) ? $1 : ++$c} = question($2);
    }
    return \%hash;
}

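# Extract the label and the ordered list of [value, text] choices
sub question {
    my ($s, %hash) = $_[0];
    ($hash{'label'}) = $s =~ /LABEL (.*)/;
    # Only use $1 if the match succeeded; after a failed match it would
    # still hold the capture from the LABEL match above
    if ($s =~ /START CHOICES\s+(.*?)\s+END CHOICES/s) {
        for (split / *\n */, $1) {
            push @{$hash{'choices'}}, [split / /, $_, 2];
        }
    }
    return \%hash;
}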

__DATA__
START PAGE p1
    START QUESTION 4B
        LABEL Do you like your pie with ice cream?
        START CHOICES
            1 Yes
            2 No
        END CHOICES
    END QUESTION
    START QUESTION 4C
        LABEL Do you like your pie with whipped cream?
        START CHOICES
            1 Yes
            2 No
        END CHOICES
    END QUESTION
END PAGE

I've made the choices into arrays instead of hashes, to preserve order.
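
To show what that ordering buys you, here is a minimal sketch of walking the resulting structure. The hard-coded $structure is a hand-written sample of the shape the parser builds for one question, and render() is a hypothetical helper of mine, not part of the code above. Pages and questions are still hashes, so I sort their keys to get a stable order; only the choices come back in source order.

```perl
use strict;
use warnings;

# Hand-written sample of the shape the parser produces (assumption:
# one page "p1" holding one question "4B").
my $structure = {
    p1 => {
        '4B' => {
            label   => 'Do you like your pie with ice cream?',
            choices => [ [ 1, 'Yes' ], [ 2, 'No' ] ],
        },
    },
};

# render() is an illustrative helper: it flattens the structure
# into printable lines.
sub render {
    my ($data) = @_;
    my @lines;
    for my $page (sort keys %$data) {
        for my $q (sort keys %{ $data->{$page} }) {
            my $question = $data->{$page}{$q};
            push @lines, "$page/$q: $question->{label}";
            # The array keeps the choices in the order they were written.
            for my $choice (@{ $question->{choices} || [] }) {
                my ($value, $text) = @$choice;
                push @lines, "  [$value] $text";
            }
        }
    }
    return @lines;
}

print "$_\n" for render($structure);
```

If you also need pages and questions in source order, the same trick applies: store them as arrays of [name, data] pairs instead of hash entries.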

In reply to Re: Parsing a macro language by TedPride
in thread Parsing a macro language by bluetrust

There's more than one way to do things
	PerlMonks