for the sake of the discussion I would like to learn how the lazy operator works and why it isn't working in this case

Conceptually (disregarding optimizations, for instance), this is how it works:

m{ / (.+?) $ }x # for readability
The engine will try the string character by character. So, first of all, it will try to match the literal forward slash. That will match:
/ # matched so far

Then, the expression .+? is really the same as ..*? (one character, followed by a lazy 'zero or more'). So the regex engine will try to match any character except newline (for the first dot). That will match:

/f # matched so far
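To check the claim that .+? and ..*? behave the same, here is a small sketch. The input string 'a/foo/bar' is made up for illustration; it is not the string from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample string; the original thread's input is not shown here.
my $str = 'a/foo/bar';

# Same pattern written two ways: lazy "one or more" vs. "one, then lazy zero or more".
my ($lazy_plus)     = $str =~ m{ / (.+?)  $ }x;
my ($dot_lazy_star) = $str =~ m{ / (..*?) $ }x;

print "$lazy_plus\n";
print "$dot_lazy_star\n";
```

Both captures come out as foo/bar. The match begins at the first slash in the string, and the lazy quantifier only controls how far the match extends from there.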

Then, it will come to a choice. .*? means '0 or more, preferring as few as possible'. First of all, the engine will save its state. Then, because the quantifier is lazy, it will first try to match nothing for .*? That will match (the empty match always matches):

/f # matched so far; decision point is saved

Then, it will try to match 'dollar' - end of string or just before the newline at the end. That will fail, because the end of the string won't be reached yet.
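A quick way to see what 'dollar' accepts (without the /m modifier) is to try it on a few strings. The sample strings below are made up for illustration.

```perl
use strict;
use warnings;

# Made-up inputs: plain end of string, a trailing newline, and a newline in the middle.
for my $s ("foo", "foo\n", "foo\nbar") {
    (my $shown = $s) =~ s/\n/\\n/g;    # make the newline visible when printing
    printf "%-10s %s\n", $shown, ($s =~ /foo$/ ? "matches" : "does not match");
}
```

The first two strings match and the third does not: $ is satisfied at the very end of the string or just before a newline that ends the string, but not before a newline in the middle.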

Then, it will backtrack - the engine will load the previous 'saved state' and will try the other decision in an attempt to match. It will try to match something (rather than the empty string). That will match.

/fo # matched so far
Then, it will save its state and try to match nothing, which will be successful:
/fo # matched so far, decision point saved
Then it will try to match the end of line again. In case of failure, it will reload the previous state and try to match something instead:
/foo # matched so far
The engine will keep doing that, alternating between decisions, until it reaches the end of the line.
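The loop described above can be imitated by hand. This is only a conceptual sketch, not how Perl's engine is actually implemented: lazy_match is a hypothetical helper, the input 'a/foo' is made up, and $ is simplified to mean the true end of the string.

```perl
use strict;
use warnings;

# Hand-written imitation of m{ / (.+?) $ }x, for illustration only.
# Simplification: $ is treated as "true end of string" (the
# before-a-trailing-newline case is ignored).
sub lazy_match {
    my ($str) = @_;
    my $len = length $str;
    my @trace;    # every "matched so far" state, in order

    for my $start (0 .. $len - 1) {
        # Literal slash.
        next unless substr($str, $start, 1) eq '/';

        # First dot of ..*? : must consume one non-newline character.
        my $pos = $start + 1;
        next if $pos >= $len || substr($str, $pos, 1) eq "\n";
        $pos++;
        push @trace, substr($str, $start, $pos - $start);

        while (1) {
            # Lazy choice: take nothing more and test $ right away.
            return (substr($str, $start + 1, $pos - $start - 1), \@trace)
                if $pos == $len;

            # $ failed: backtrack and let the dot eat one more character.
            last if substr($str, $pos, 1) eq "\n";
            $pos++;
            push @trace, substr($str, $start, $pos - $start);
        }
    }
    return (undef, \@trace);
}

my ($capture, $trace) = lazy_match('a/foo');    # 'a/foo' is a made-up input
print "$_ # matched so far\n" for @$trace;
print "captured: $capture\n";
```

For 'a/foo' this prints the states /f, /fo and /foo, then "captured: foo". The trace grows one character at a time because the end-of-string test is retried after every single step.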

Considerable optimizations are possible here, as you might have noticed (and Perl's engine is heavily optimized). But, in principle, this is how it should work for the kind of regex engine that Perl uses.


In reply to Re: help with lazy matching by Anonymous Monk
in thread help with lazy matching by Special_K
