in reply to help with lazy matching
for the sake of the discussion I would like to learn how the lazy operator works and why it isn't working in this case
Conceptually (disregarding optimizations, for instance), this is how it works:
m{ / (.+?) $ }x # for readability
It will try the string character by character. So, first of all, it will try to match 'forward slash'. That will match
/ # matched so far
Then, the expression .+? is really the same as ..*? So the regex engine will try to match any character except newline (for the first dot). That will match
/f # matched so far
Then, it will come to a choice. .*? means '0 or more, preferring as few as possible'. First of all, the engine will save its state. Then it will try to match nothing for .*? That will match (an empty match always succeeds)
/f # matched so far; decision point is saved
Then, it will try to match 'dollar' - end of string or just before the newline at the end. That will fail, because the end of the string has not been reached yet.
Then, it will backtrack - the engine will load the previous 'saved state' and will try the other decision in an attempt to match. It will try to match something (rather than the empty string). That will match.
/fo # matched so far
Then, it will save its state and try to match nothing, which will be successful
/fo # matched so far, decision point saved
Then it will try to match the end of line again. In case of failure, it will reload the previous state and try to match something instead
/foo # matched so far
The engine will keep doing that, alternating between decisions, until it reaches the end of line.
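If it helps, the walk-through above can be replayed with a small hand-written loop. This is only a sketch of the conceptual model, not how Perl's engine is actually implemented: trace_lazy is a hypothetical helper, it handles just this one pattern, and it only models the attempt that starts at offset 0 of the string (a real engine would go on to retry at later offsets).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): replays, step by step, how a
# backtracking engine could match m{ / (.+?) $ }x when the attempt starts
# at offset 0 of $str. Returns the trace lines and the capture (or undef).
sub trace_lazy {
    my ($str) = @_;
    my @trace;

    # The literal '/' must match first.
    return (\@trace, undef) unless substr($str, 0, 1) eq '/';
    push @trace, '/';

    # .+? is ..*? so the first dot must consume one non-newline character.
    my $end = 1;    # offset just past what has been matched so far
    return (\@trace, undef)
        if $end >= length($str) || substr($str, $end, 1) eq "\n";
    $end++;
    push @trace, substr($str, 0, $end);

    while (1) {
        # Lazy choice 1: let .*? match nothing more and try '$'.
        # '$' succeeds at the very end, or just before a final newline.
        my $rest = substr($str, $end);
        if ($rest eq '' || $rest eq "\n") {
            return (\@trace, substr($str, 1, $end - 1));
        }

        # '$' failed: go back to the saved state and take choice 2,
        # i.e. let the dot consume one more character (never a newline).
        return (\@trace, undef) if substr($str, $end, 1) eq "\n";
        $end++;
        push @trace, substr($str, 0, $end);
    }
}

my ($trace, $capture) = trace_lazy('/foo');
print "$_ # matched so far\n" for @$trace;
print "captured: $capture\n";
```

For '/foo' the trace goes /, /f, /fo, /foo and the capture is foo, the same sequence as in the steps above.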
Considerable optimizations are possible here, as you might have noticed (and Perl's engine is heavily optimized). But, in principle, this is how it works for the kind of regex engine that Perl uses.
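Whatever the optimizations, the result should agree with the conceptual model, and that is easy to check against the real engine. A minimal sketch follows; lazy_capture is a name made up for illustration, and since the input from the original question is not shown here, '/foo/bar' is just an assumed example.

```perl
use strict;
use warnings;

# Returns what (.+?) captures, or undef when the pattern does not match.
sub lazy_capture {
    my ($str) = @_;
    return $str =~ m{ / (.+?) $ }x ? $1 : undef;
}

for my $input ("/foo", "/foo\n", "/foo/bar") {
    my $got = lazy_capture($input);
    (my $shown = $input) =~ s/\n/\\n/g;    # make the newline visible
    printf "%-10s => %s\n", $shown, defined $got ? $got : '(no match)';
}
```

The first two inputs capture foo (the 'dollar' also matches just before a trailing newline). The third captures foo/bar, not bar: the attempt starts at the leftmost slash, and the lazy quantifier only decides how far to extend from there, one character at a time, until 'dollar' can match. It does not move the starting point.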