Consider the following code:

local $_ = 'foo';
say 'Start' if /\G foo/gcx;
say 'Mid' if /\G .*/gcx;
say 'End' if /\G \z/gcx;

The output will be:

Start Mid

Change the "Mid" case to this:

say 'Mid' if /\G .+/gcx;

And now the output will be:

Start Mid End

So now \z matches in the third expression. If the second expression ends in any of the following quantifiers, \z will not match in the third expression: *, ?, {0,}.

This is confirmed on Perl 5.26 and 5.10.
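For anyone who wants to reproduce this, here is a minimal sketch that logs pos() after each successful match, so the .* and .+ variants can be compared side by side. The helper name trace_matches is made up for this illustration and is not part of any module. One thing that may be relevant: perlre's section on repeated patterns matching a zero-length substring says that a match following a zero-length match is prohibited from being zero-length, and that this state is attached to the matched string. I am assuming, not asserting, that this is what is in play here.

```perl
use strict;
use warnings;

# Hypothetical helper: run the three \G-anchored matches against $str,
# using $mid as the middle pattern, and log each label that matched
# together with pos() at that moment.
sub trace_matches {
    my ($str, $mid) = @_;
    my @log;
    local $_ = $str;
    push @log, "Start pos=" . pos($_) if /\G foo/gcx;
    push @log, "Mid pos="   . pos($_) if /\G $mid/gcx;
    push @log, "End pos="   . pos($_) if /\G \z/gcx;
    return @log;
}

# Compare the two quantifiers from the post on the same input.
for my $mid (qr/.*/, qr/.+/) {
    print "$mid: ", join(', ', trace_matches('foo', $mid)), "\n";
}
```

If the "Mid" entry for .* shows the same pos() as "Start", then .* succeeded without consuming anything, which is the zero-length case the perlre section describes.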

Similarly:

perl -E 'local $_ = "foo\n"; say "Start" if /\G foo/gcx; say "Mid" if /\G .*/gcx; say "End" if /\G (?=\n)/gcx'

In this case we added a \n to the string and matched .* in the "Mid" expression, then used a lookahead assertion for \n in the "End" expression. Since we are not using the /s modifier, .* should stop before the \n, so (?=\n) should still find the newline (I think), and the "End" condition should be true. Yet "End" is not printed.
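To check the assumption that .* stops before the newline, the following sketch captures what the "Mid" pattern actually consumed and then tries the lookahead. The helper probe_newline and the hash keys it returns are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: report what the "Mid" pattern consumed on "foo\n"
# and whether the lookahead for "\n" still succeeds afterwards.
sub probe_newline {
    my ($mid) = @_;
    local $_ = "foo\n";
    my %seen;
    /\G foo/gcx or return;
    if (/\G ($mid)/gcx) {
        $seen{mid_text} = $1;        # text consumed by the middle pattern
        $seen{mid_pos}  = pos($_);   # position after the middle match
    }
    $seen{end} = /\G (?=\n)/gcx ? 1 : 0;
    return \%seen;
}

for my $mid (qr/.*/, qr/.+/) {
    my $r = probe_newline($mid);
    printf "%s: mid=%s end=%d\n", $mid,
        exists $r->{mid_text} ? "'$r->{mid_text}' at pos $r->{mid_pos}" : 'no match',
        $r->{end};
}
```

An empty mid_text with pos() still sitting just before the newline would mean .* did stop where expected, so whatever makes the lookahead fail is not .* running past the \n.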

I feel that the difference in how .+ and .* consume the string (\z matching in the third expression when the second expression uses .+, but not when it uses .*) is an inconsistency that can't be defended as not being a bug, but I'm interested in what others' take on it might be.


Dave


In reply to Seeking clarification on possible bug in regex using \G and /gc by davido
