If you want to avoid PRD (Parse::RecDescent), there are a few things you can do:
  • Write an event parser for your language:
    • Pass events to an event handler object.
    • The two tokens 'page foo' generate an event new_type("page","foo"), which creates a new element as a child of the element at the top of a stack.
    • The { token pushes the last child of the element at the top of the stack onto the stack.
    • The } token pops an element from the stack.
    • Anything that is not recognized as a "new thing" structure ((\w+)\s+(?:(\w+)\s+)?\{) is collected and passed to the 'character_data' event, in your case probably one event per line.
    • The event handler has a 'root' element predefined, at the top of the stack.
  • Use something like the event parser to convert the language, without keeping any state, into XML or YAML or whatever, and use an existing parser for that.
  • Use embedded code in regexes, (?{ }) and (??{ }), in a manner similar to the event handler. If you go that way, you can nest expressions using (??{ }). See perlre for some devious tricks you can do with this construct. /msg me if you would like me to post an example.
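For the first option, here is a minimal sketch of what I mean by a decoupled event parser. It is not tested against your real input. It assumes every "new thing" header and every closing brace sits on its own line, and the names TreeHandler, open_block, close_block and parse_events are made up for illustration (only new_type and character_data come from the description above).

```perl
use strict;
use warnings;

# Hypothetical handler class: owns the stack, with a predefined root on top.
package TreeHandler;

sub new {
    my ($class) = @_;
    my $root = { type => 'doc', name => 'root', contains => [] };
    return bless { root => $root, stack => [$root] }, $class;
}

# 'page foo': create a new element as a child of the top of the stack.
sub new_type {
    my ($self, $type, $name) = @_;
    my $elem = { type => $type, contains => [] };
    $elem->{name} = $name if defined $name;
    push @{ $self->{stack}[-1]{contains} }, $elem;
}

# '{': the last child of the top of the stack becomes the new top.
sub open_block {
    my ($self) = @_;
    push @{ $self->{stack} }, $self->{stack}[-1]{contains}[-1];
}

# '}': pop an element, but never the root.
sub close_block {
    my ($self) = @_;
    die "unbalanced '}'\n" if @{ $self->{stack} } == 1;
    pop @{ $self->{stack} };
}

# Anything else is plain text belonging to the current element.
sub character_data {
    my ($self, $text) = @_;
    push @{ $self->{stack}[-1]{contains} }, $text;
}

package main;

# The parser only recognizes tokens and fires events; it keeps no tree state.
sub parse_events {
    my ($src, $handler) = @_;
    for my $line (split /\n/, $src) {
        $line =~ s/^\s+|\s+$//g;          # trim both ends
        next if $line eq '';
        if ($line =~ /^(\w+)\s+(?:(\w+)\s+)?\{$/) {
            $handler->new_type($1, $2);   # $2 is undef when there is no name
            $handler->open_block;
        }
        elsif ($line eq '}') {
            $handler->close_block;
        }
        else {
            $handler->character_data($line);
        }
    }
    return $handler->{root};
}
```

Calling parse_events($source, TreeHandler->new) returns the root element. Because the parser only talks to the handler through method calls, you can swap in a handler that prints XML or YAML instead of building a tree.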
Update: it's done. It was fun, but don't use it. Someone below implemented the event parser I was talking about, just not in a decoupled OO kind of way.
use strict;
use warnings;
use re 'eval';

my $str = <<FOO;
page p1 {
    question 4B {
        label {
            Do you like your pie with ice cream?
        }
        single {
            1 Yes
            2 No
        }
    }
    question 4C {
        label {
            Do you like your pie with whipped cream?
        }
        single {
            1 Yes
            2 No
        }
    }
}
FOO

my $string = qr/
    ^ (?> \s* (.+) ) \s* $
    (?{ add_string($^N) })
/xm;

my $tokens;
my ($type, $name);

my $block = qr/
    # capture a type
    (?: (\w+) \s+ ) (?{ $type = $^N })

    ( # capture an optional name, set $name to that
        (?{ $name = undef }) # first unset $name, in case this doesn't match
        ((?: (\w+) \s+ )(?{ $name = $^N }) )?
    )

    \{ # if this starts to look like an element, push a new cell on the stack
    (?{ new_elem($type, $name) })

    (
        ( # this subpattern tries to capture a complete body, with the closing brace
            (??{ $tokens })
            \} (?{ close_elem() }) # if we got here it means we have a full body, with tokens and a closing brace
        ) | ( # if we got here, then the body subpattern failed, and we must abort
            (?{ abort_elem() })
            (?!) # this match always fails because it negates a match on anything, that always succeeds
        )
    )
/xs;

my $blocks  = qr/($block \s*)+/xs;
my $strings = qr/($string \s*)+?/xs;

$tokens = qr/\s* ( $blocks | $strings ) \s*/xs; # tokens is either some strings, or some blocks

my $doc = qr/^$tokens$/s;

my @stack;
new_elem("doc" => "root"); # create the root element

$str =~ $doc;

use Data::Dumper;
warn Dumper(@stack); # should contain just the root element

sub new_elem {
    my $elem = {
        type => $_[0],
        (defined($_[1]) ? (name => $_[1]) : ()),
        contains => [],
    };
    if (@stack){
        push @{ $stack[-1]{contains} }, $elem
    }
    push @stack, $elem;
}

sub abort_elem {
    pop @stack;
    pop @{ $stack[-1]{contains} };
}

sub close_elem { pop @stack }
sub add_string { push @{ $stack[-1]{contains} }, $_[0] }
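If you would rather go the "convert it to XML and use a real parser" route, the tree built above (hashes with type, an optional name, and a contains list) can be written out in a few lines. This is only a sketch: to_xml and xml_escape are hypothetical helpers, not part of any module, and it assumes every type is already a valid XML element name.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Recursively serialize one element; plain strings become character data.
sub to_xml {
    my ($elem, $depth) = @_;
    $depth //= 0;
    my $pad = '  ' x $depth;

    return $pad . xml_escape($elem) . "\n" unless ref $elem;

    # The optional name becomes a name="..." attribute.
    my $attr = defined $elem->{name}
        ? sprintf(' name="%s"', xml_escape($elem->{name}))
        : '';

    my $xml = "$pad<$elem->{type}$attr>\n";
    $xml .= to_xml($_, $depth + 1) for @{ $elem->{contains} };
    $xml .= "$pad</$elem->{type}>\n";
    return $xml;
}
```

Something like print to_xml($stack[0]) after the match would dump the whole document, which you can then hand to whatever XML parser you like.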
zz zZ Z Z #!perl

In reply to Re: Parsing a macro language by nothingmuch
in thread Parsing a macro language by bluetrust
