Bother, lots of problems. foo=foo=foo isn't supposed to be allowed, but the post-processing phase could check for that. And .*?D really shouldn't be slower than [^D]*D - I'll have to take a look at why that happens and see if it can be fixed (the minimal matching support was added to perl relatively recently, and it hasn't had the same degree of optimisation that the older codepaths have had).
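For reference, the two forms should pick out the same text when anchored at the same starting point, so any difference is one of speed only. A minimal sketch, using a made-up sample string:

```perl
use strict;
use warnings;

# Hypothetical sample text; the newline is there to show why /s matters for '.'
my $text = "abc\nDefD";

# Minimal match: '.' needs /s to cross the newline
my ($lazy) = $text =~ m{ \A ( .*? D ) }xs;

# Negated character class: [^D] matches a newline without any flag
my ($negated) = $text =~ m{ \A ( [^D]* D ) }x;

print $lazy eq $negated ? "same match\n" : "different match\n";
```

Both stop at the first D, so timing them against each other (with the core Benchmark module, say) compares like with like.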

I think the [^\\] is marginally clearer about its intent than . would be, though I accept the point; the missing /s and support for the colon delimiter were simply oversights; and the tail assertion should have been (?= \s | \z ).
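To illustrate what the tail assertion buys, here is a small sketch of the bareword pattern on its own. The sample inputs are invented for the example, and \A stands in for \G since there is no surrounding loop here:

```perl
use strict;
use warnings;

# A bareword must be followed by whitespace or the end of the string
my $bareword = qr{ \A (\w+) (?= \s | \z ) \s* }x;

my %is_bareword;
for my $input ('foo bar', 'foo', 'foo=bar', 'foo"bar') {
    $is_bareword{$input} = $input =~ $bareword ? 1 : 0;
    printf "%-8s => %s\n", $input,
        $is_bareword{$input} ? 'bareword' : 'not a bareword';
}
```

Without the assertion, 'foo"bar' would be accepted as the bareword foo and leave a stray quote for the next iteration to trip over.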

So let's try again:

    while (pos($_) < length($_)) {
        if (m{ \G (\w+) \s* = \s* (?=\S) }gcx) {
            # key-value pair is fixed-up in post-processing
            push @args, [ $1 ];
        } elsif (m{ \G (\w+) (?= \s | \z ) \s* }gcx) {
            push @args, $1;
        } elsif (m{ \G (['"]) ( (?: \\. | [^\\] )*? ) \1 (?= \s | \z ) \s* }gcxs) {
            # $2 is the whole quoted body; strip the backslash escapes
            (my $quoted = $2) =~ s/\\(.)/$1/g;
            push @args, $quoted;
        } elsif (m{ \G (:) \s+ }gcx) {
            push @args, $1;
        } else {
            die "parsing error\n";
        }
    }

    for (my $i = 0; $i < @args; ++$i) {
        # only keys are stored as references
        next unless ref $args[$i];
        my $value = splice @args, $i+1, 1;
        # a key must be followed by a plain value (this rejects foo=foo=foo)
        die "parsing error\n" if !defined($value) || ref $value;
        $args[$i] = [ $args[$i][0], $value ];
    }
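One detail worth checking in the quoted-string branch: a quantified capture group keeps only its last iteration, so to get the whole quoted body the repetition needs a capture group of its own around it. A minimal sketch of the difference, with a made-up input string and \A in place of \G:

```perl
use strict;
use warnings;

# Hypothetical input: a quoted argument containing escaped quotes
my $input = q{"a \"b\" c" rest};

# Capture group is itself repeated: it ends up holding the last iteration only
my (undef, $last_only) =
    $input =~ m{ \A (['"]) ( \\. | [^\\] )*? \1 (?= \s | \z ) }xs;

# Capture group wraps the whole repetition: it holds the full quoted body
my (undef, $whole) =
    $input =~ m{ \A (['"]) ( (?: \\. | [^\\] )*? ) \1 (?= \s | \z ) }xs;

# Remove the backslash escapes, as the tokenizer does
(my $unescaped = $whole) =~ s/\\(.)/$1/g;

print "last iteration only: $last_only\n";
print "whole body:          $whole\n";
print "unescaped:           $unescaped\n";
```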

Update: string inconsistently used as $tag and $_, fixed up to use $_ throughout.

Hugo

In reply to Re: Re: Re: Re: Parsing arguments by hv
in thread Parsing arguments by hv
