I find regular expressions in general to be the least intuitive, most surprising computer "language" I've encountered that is actually intended to be practical and useful rather than merely obscure. (Technically, they're a Domain-Specific Embedded Language, a DSEL.) For an example of the merely obscure, see Brainfuck, the source code of which looks remarkably like a traditional "line noise" regex definition.
My favorite example of this counter-intuitiveness is the result of matching the simple regex /(b*)/ against the string 'aaaaabbb':
'aaaaabbb' =~ /(b*)/;
What will be matched and captured to $1, and where will the match occur? Knowing that matching is, by default, "greedy" and matches as much as possible, one's first thought might be, as mine has often been, that it will match and capture 'bbb' at offset 5 in the string. Contemplation of the "Leftmost, Longest" rule for regex matching would seem to support this initial idea: offset 5 is the leftmost position at which the most 'b' characters are found (all of them, in fact).
A simple experiment shows we are deceived:
c:\@Work\Perl\monks>perl -wMstrict -le "print qq{matched '$1' at offset $-[1]} if 'aaaaabbb' =~ /(b*)/; "
matched '' at offset 0
(The @- array holds the offset of the start of each corresponding capture group match. See the "Variables related to regular expressions" section of perlvar.)
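The reason, as far as I understand it, is that "leftmost" takes priority over "longest": the engine tries each starting position from left to right, and b* is allowed to match zero characters, so it succeeds with an empty match at offset 0 before it ever looks at the 'b' characters. Here is a minimal sketch that contrasts b* with b+, which needs at least one 'b'. The first_capture helper is a hypothetical name of my own, not something from the original one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): returns the text and start
# offset of capture group 1, or an empty list if the match fails.
sub first_capture {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;
    return ($1, $-[1]);
}

my $string = 'aaaaabbb';

# b* can match zero characters, so it succeeds at the first position tried.
my ($star_text, $star_offset) = first_capture($string, qr/(b*)/);
print "b* matched '$star_text' at offset $star_offset\n";   # '' at offset 0

# b+ needs at least one 'b', so the engine has to advance to offset 5.
my ($plus_text, $plus_offset) = first_capture($string, qr/(b+)/);
print "b+ matched '$plus_text' at offset $plus_offset\n";   # 'bbb' at offset 5
```

So greediness only decides how much is taken once a starting position has been settled; it never pushes the start of the match to the right.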
I do not show you these things in order to discourage you, but rather to steel you against the frustrations and perplexities that inevitably accompany the study and use of regular expressions.
Give a man a fish: <%-(-(-(-<
In reply to Re: exist backreference variable list?
by AnomalousMonk
in thread exist backreference variable list?
by PerlJam2015