Hello all,

I need to parse property files that might have multiline entries (otherwise a simple regexp would do), and I need to process comments and maintain order (otherwise Config::Properties would do).

My idea is to use a positive lookahead, so that a var=val pair starts at ^var= and ends just before the next ^var=.
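To make the lookahead idea concrete outside of Parse::RecDescent, here is a minimal plain-regex sketch. It is not the grammar from this post: parse_props and the hash keys are hypothetical names, and it assumes that a continuation line never contains "=" and never starts with "#" (either would be read as the start of a new entry).

```perl
use strict;
use warnings;

# One entry per match: either a comment line or a name=value pair.
# The value is matched lazily and ends where the lookahead sees
# the start of the next entry (or the end of the input).
my $entry = qr{
    \G \n*                                # resume after the previous entry, skip newlines
    (?:
        ^ \# [ ]? (?<comment> [^\n]* )    # comment text without the leading hash
      |
        ^ (?<name> [^=\n]+ ) = (?<value> .*? )
        (?= \n+ [^=\n]+ =                 # stop just before the next name= line,
          | \n+ \#                        #   or before a comment line,
          | \n* \z )                      #   or at the end of the input
    )
}xms;

# Hypothetical helper: returns the entries in file order.
sub parse_props {
    my ($text) = @_;
    my @entries;
    while ($text =~ /$entry/g) {
        if (defined $+{comment}) {
            push @entries, { type => 'comment', text => $+{comment} };
        }
        else {
            push @entries, { type => 'prop', name => $+{name}, value => $+{value} };
        }
    }
    return @entries;
}

my $sample = <<'END';
# Comment Line
# Comment #2
foo=this is property
one but
bar=does it grab this one too?
baz=snark
END

for my $e (parse_props($sample)) {
    if ($e->{type} eq 'comment') {
        print "COMMENT: $e->{text}\n";
    }
    else {
        print "VAR: $e->{name}\nVAL: $e->{value}\n";
    }
}
```

With the multiline sample, foo should come back with both lines of its value, and the comments stay in their original positions because entries are pushed in match order.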

I probably need a nudge toward either the skip directive or the proper pattern modifier, but here is sample code:

use strict;
use Parse::RecDescent;

$::TestGrammar = <<'TG';

Output: PropLine(s) /\Z/

PropLine: CommentLine | SimpleProp | LastProp

CommentLine: /\#.*\n/
    {
        print "RULE: $item{__RULE__}\n";
        print "MATCH: $item{__PATTERN1__}\n";
    }

SimpleProp: VAR EQ VAL ...LastProp
    {
        print "RULE: $item{__RULE__}\n";
        print "VAR: $item{VAR}\n";
        print "EQ: $item{EQ}\n";
        print "VAL: $item{VAL}\n\n";
    }

LastProp: VAR EQ VAL
    {
        print "RULE: $item{__RULE__}\n";
        print "VAR: $item{VAR}\n";
        print "EQ: $item{EQ}\n";
        print "VAL: $item{VAL}\n\n";
    }

VAR: /[^=]+/
EQ: '='
VAL: /.*/

TG

undef $/;
my $foo = <>;

my $parser = Parse::RecDescent->new($::TestGrammar);
defined $parser->Output($foo) or die "FAILURE";
__END__

When I run this using this file:

===================
# Comment Line
# Comment #2
foo=this is property one but
bar=does it grab this one too?
baz=snark
===================

I see this, as expected:

===================
RULE: CommentLine
MATCH: # Comment Line

RULE: CommentLine
MATCH: # Comment #2

RULE: SimpleProp
VAR: foo
EQ: =
VAL: this is property one but

RULE: SimpleProp
VAR: bar
EQ: =
VAL: does it grab this one too?

RULE: LastProp
VAR: baz
EQ: =
VAL: snark
===================

However, if I make one of the props multiline:

===================
# Comment Line
# Comment #2
foo=this is property
one but
bar=does it grab this one too?
baz=snark
===================

I see this:

===================
RULE: CommentLine
MATCH: # Comment Line

RULE: CommentLine
MATCH: # Comment #2

RULE: SimpleProp
VAR: foo
EQ: =
VAL: this is property   <----the rest of this VAL becomes part of the next VAR. no joy.

RULE: SimpleProp
VAR: one but
bar
EQ: =
VAL: does it grab this one too?

RULE: LastProp
VAR: baz
EQ: =
VAL: snark
===================

I have tried things like changing VAR: to /^[^=]+/m, but have yet to find the right combination.
Thank you in advance for any comments you can make.
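For reference while experimenting with modifiers, here is a small plain-Perl sketch of what /s and /m change. It runs outside Parse::RecDescent, so it only illustrates the regex behaviour and is not a fix for the grammar; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample with one multiline value.
my $text = "foo=this is property\none but\nbar=next\n";

# No modifier: "." does not match a newline, so .* stops at the end of the line.
my ($plain) = $text =~ /=(.*)/;

# /s: "." also matches newlines, so a greedy .* runs to the end of the string.
my ($dotall) = $text =~ /=(.*)/s;

# /m: "^" matches at the start of every line, not only at the start of the string.
my @line_starts = $text =~ /^(\w+)/mg;

print "plain:       [$plain]\n";
print "dotall:      [$dotall]\n";
print "line starts: [@line_starts]\n";
```

The point for VAL: /.*/ is that without /s the pattern can never span a newline, while with /s a greedy .* needs something like a lookahead to tell it where to stop.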

In reply to Modifying Parse::RecDescent Grammar to deal with multiline property file entries by chahn
