davido has asked for the wisdom of the Perl Monks concerning the following question:
Consider the following code:
local $_ = 'foo';
say 'Start' if /\G foo/gcx;
say 'Mid' if /\G .*/gcx;
say 'End' if /\G \z/gcx;
The output will be:
Start Mid
Change the "Mid" case to this:
say 'Mid' if /\G .+/gcx;
And now the output will be:
Start Mid End
So all three conditions match. If the second expression instead ends in one of the quantifiers *, ?, or {0,}, then \z will not match in the third expression.
I have confirmed this on Perl 5.26 and 5.10.
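To make the .* case easier to inspect, here is a minimal sketch that records whether each pattern matched and where pos() sits afterwards. The helper name trace_chain is hypothetical (it is not from the post above), and it uses a lexical string rather than $_, which I am assuming makes no difference to the behavior being discussed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): apply each pattern in
# turn to the same string with /gc, recording whether it matched and the
# value of pos() afterwards.
sub trace_chain {
    my ($str, @patterns) = @_;
    my @trace;
    for my $re (@patterns) {
        my $matched = $str =~ /$re/gc ? 1 : 0;   # scalar-context /gc match
        push @trace, [ "$re", $matched, pos($str) // 0 ];
    }
    return @trace;
}

# The "Start", "Mid" (.*) and "End" patterns from the first example.
for my $step (trace_chain('foo', qr/\G foo/x, qr/\G .*/x, qr/\G \z/x)) {
    printf "%-20s matched=%d pos=%d\n", @$step;
}
```

If the behavior is as described above, the first two patterns report matched=1 and the third reports matched=0, with pos() at 3 after every step. That would mean "Mid" matched zero characters at the same position where "End" is then attempted.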
Similarly:
perl -E 'local $_ = "foo\n"; say "Start" if /\G foo/gcx; say "Mid" if /\G .*/gcx; say "End" if /\G (?=\n)/gcx'
In this case we added a \n to the string and matched .* in the "Mid" expression, then used a lookahead assertion for \n in the "End" expression. Since we are not using the /s modifier, .* should have stopped before the \n, so (?=\n) should still find the newline (I think), and the "End" condition should be true.
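One way to probe the newline case is to check whether re-assigning pos() between the "Mid" and "End" matches changes the result. As I read perlre, the "matched with zero length" state is stored with the string and is reset by any assignment to pos(). The sketch below assumes that reading; end_matches is a hypothetical helper, not something from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the lookahead matches after the
# zero-length .* match, and 0 otherwise. When $reset_pos is true, pos()
# is assigned to itself in between, which leaves the position unchanged.
sub end_matches {
    my ($reset_pos) = @_;
    my $str = "foo\n";
    $str =~ /\G foo/gcx or return;   # "Start": consumes foo, pos is now 3
    $str =~ /\G .*/gcx  or return;   # "Mid": matches zero characters at pos 3
    pos($str) = pos($str) if $reset_pos;
    return $str =~ /\G (?=\n)/gcx ? 1 : 0;   # "End"
}

print "without reset: ", end_matches(0), "\n";
print "with reset:    ", end_matches(1), "\n";
```

If the lookahead fails without the reset and succeeds with it, the newline is still there to be found, and what blocks "End" is a second zero-length match at the same position rather than .* having consumed too much.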
The difference in how .+ and .* consume the string (\z matches in the third expression when the second expression uses .+, but not when it uses .*) feels to me like an inconsistency that can't be defended as not being a bug, but I'm interested in what others' take on it might be.
Dave
Replies are listed 'Best First'.

Re: Seeking clarification on possible bug in regex using \G and /gc
by choroba (Cardinal) on Mar 14, 2018 at 23:05 UTC
by davido (Cardinal) on Mar 15, 2018 at 01:45 UTC
by Rhandom (Curate) on Mar 14, 2018 at 23:24 UTC
by choroba (Cardinal) on Mar 14, 2018 at 23:52 UTC

Re: Seeking clarification on possible bug in regex using \G and /gc
by tybalt89 (Monsignor) on Mar 14, 2018 at 23:10 UTC
by Rhandom (Curate) on Mar 14, 2018 at 23:22 UTC
by Anonymous Monk on Mar 15, 2018 at 18:54 UTC