... I've been experimenting again with using Perl regexes more like grammars ...

My knee-jerk reaction (and, I suspect, that of many others) on reading something like this is, "Why not just use a grammar?" At least in Perl 5, regexes can be quite useful for limited grammar-parsing tasks, but they quickly run out of steam and fall over (or else become monstrously unwieldy) when pushed much beyond that. I suspect the situation may be different with Perl 6, but I must confess ignorance here. In any event, my impression is that the Perl 6 regex compiler/engine is so radically different from that of Perl 5 that feature back-ports can only be piecemeal and incremental.

Further, it's easy to come up with limited approaches to deal with oddball problems, e.g., dynamically varying quantifier counts:

c:\@Work\Perl>perl -wMstrict -MData::Dump -le "my $s = '04abcdefgh06ABCDEFGH05tuvwxyz'; my @caps = map join('', m{ \A (\d\d) }xms, m{ ([[:alpha:]]{$1}) }xms), $s =~ m{ \G \d\d [[:alpha:]]+ }xmsg ; dd @caps; "
("04abcd", "06ABCDEF", "05tuvwx")

Something like this can take care of an immediate problem, but obviously isn't going to integrate well into an even moderately large parser application — and then we're back to the "get a real parser" stance.
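
For anyone who would rather try the same idea as a script than as a Windows one-liner, here is a minimal sketch. It walks the string with \G and the /gc modifiers instead of the map/join trick. The sub name counted_fields is my own invention for illustration, and I'm assuming the input always looks like the sample (a two-digit count followed by at least that many letters).

```perl
use strict;
use warnings;

# Hypothetical helper (the name is made up for this sketch): walk a string
# of length-prefixed fields and return each two-digit count together with
# exactly that many of the letters that follow it.
sub counted_fields {
    my ($s) = @_;
    my @caps;
    pos($s) = 0;

    # /g with \G resumes where the last match ended; /c keeps pos() on failure.
    while ($s =~ m{ \G (\d\d) }xmsgc) {
        my $digits = $1;
        my $n      = $digits + 0;    # numify, e.g. '04' -> 4

        # The count is interpolated into the quantifier when this pattern is built.
        $s =~ m{ \G ([[:alpha:]]{$n}) }xmsgc or last;
        push @caps, $digits . $1;

        # Skip any letters beyond the counted ones, as the one-liner does.
        $s =~ m{ \G [[:alpha:]]* }xmsgc;
    }
    return @caps;
}

print join(', ', counted_fields('04abcdefgh06ABCDEFGH05tuvwxyz')), "\n";
# prints: 04abcd, 06ABCDEF, 05tuvwx
```

It is still the same limited approach, just spread over a few more lines, so the caveat about integrating it into a larger parser applies equally here.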

Please don't take these remarks as an attack on your interest in the problem. I share and applaud that interest and will upvote the OP as soon as I finish posting this. I love regexes too, but we must learn to recognize the limitations even of those we love most.


Give a man a fish:  <%-(-(-(-<


In reply to Re: Advanced techniques with regex quantifiers by AnomalousMonk
in thread Advanced techniques with regex quantifiers by smls
