PerlMonks
Re^8: Understanding a portion of perlretut
by AnomalousMonk (Archbishop) on Dec 10, 2015 at 22:22 UTC ( [id://1149956]=note )
    Is your Supplemental question meant rhetorically?

It was meant rhetorically, but I'm glad you enjoyed it!

    ... my $s = 'abCdefC'; while ($s =~ / (f)*? C /gx) { ... }

I think choroba has already well addressed the issues you raised in the paragraph following the one from which this is quoted, but let me try to address this one specifically, insofar as I understand what's going on and assuming I understand your question!

In the code below, I think we're both happy that the (f)*? capture group acting before the first 'C' in the string is allowed not to match at all, and in that case the value of the capture variable ($1 in the code) is undef. I think we can agree that if the group expression were changed to (f*?) it would also match, capturing the empty string to $1.

The second 'C' in the string is preceded by an 'f'. Why do both (f)*? and (f*?) capture the 'f' when they can be satisfied with nothing and need not be satisfied with anything more than nothing (i.e., they both do lazy matching)? Here's my story. If the RE matches nothing at offset 5 (the 'f'), it must then match a 'C' at offset 5, which is already occupied by an 'f', in order to satisfy the overall regex! The RE must first "consume" the 'f' at offset 5 before it can advance to match the 'C' at offset 6 for an overall match.
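The story above can be checked with a small sketch. This is not the original listing from the post: the helper name captures is hypothetical, and only the string 'abCdefC' and the two group forms are taken from the text.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): for each /g match of
# $re against $str, record the start offset of the match and the value of $1.
sub captures {
    my ($re, $str) = @_;
    my @out;
    while ($str =~ /$re/g) {
        push @out, [ $-[0], $1 ];    # $-[0] is the offset where the match began
    }
    return @out;
}

for my $case ( [ '(f)*? C', qr/ (f)*? C /x ], [ '(f*?) C', qr/ (f*?) C /x ] ) {
    my ($label, $re) = @$case;
    for my $m (captures($re, 'abCdefC')) {
        my ($start, $cap) = @$m;
        printf "%-8s offset %d: \$1 is %s\n", $label, $start,
            defined $cap ? "'$cap'" : 'undef';
    }
}

# Summary of what a modern perl reports:
#   (f)*? C : match at offset 2 with $1 undef, match at offset 5 with $1 'f'
#   (f*?) C : match at offset 2 with $1 '',    match at offset 5 with $1 'f'
```

In both forms the second match starts at offset 5 rather than 6: the lazy group first tries to match nothing, the 'C' then fails against the 'f', and the engine backtracks into the group to take one 'f' before 'C' can succeed at offset 6.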
But here's a non-rhetorical question. In the code below, notice that there is a peculiar double-step at pos 3. The 'f' at offset 2 is first not captured (either as undef or as the empty string), then captured. I don't get it: a non-zero-width match is never a necessity for an overall match. Why not just step over the 'f' at offset 2 in the same way all the other characters are stepped over?

(This code produces the same output under Strawberry Perl 5.10, 5.12 and 5.14. However, when run under ActiveState 5.8.9, the output is the same except that $1 is always undef! I assume this is a bug that was fixed between 5.8.9 and 5.10.x, or maybe between AS and Strawberry.)

Update: Consider also the second code example with a string of 'xxfffxx' for similar perplexity.

Give a man a fish: <%-{-{-{-<
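The listing the question refers to is not reproduced here, so the following is only a guess at its shape: a minimal sketch assuming the string 'xxfxx' (an 'f' at offset 2) and a pattern made of nothing but the lazy group (f*?). The helper name lazy_steps is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: for each /g iteration, record pos() after the match
# together with what the lazy group captured.
sub lazy_steps {
    my ($str) = @_;
    my @steps;
    while ($str =~ / (f*?) /gx) {
        push @steps, [ pos($str), $1 ];
    }
    return @steps;
}

for my $step (lazy_steps('xxfxx')) {
    my ($pos, $cap) = @$step;
    printf "pos %d: \$1 = '%s'\n", $pos, $cap;
}

# Prints:
#   pos 0: $1 = ''
#   pos 1: $1 = ''
#   pos 2: $1 = ''
#   pos 3: $1 = 'f'
#   pos 3: $1 = ''
#   pos 4: $1 = ''
#   pos 5: $1 = ''
```

Under these assumptions pos 3 shows up twice, once with the 'f' captured and once with the empty string. That looks consistent with the rule described in perlre under "Repeated Patterns Matching a Zero-length Substring": the match that follows a zero-length match may not itself be zero-length at the same position, so after the empty match at offset 2 the engine is pushed into consuming the 'f'. Calling lazy_steps('xxfffxx') is one way to explore the string from the update.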