It happens exactly as described in perlre. For the long version pick up Mastering Regular Expressions.

The actual way it works is rather complex because of all of the optimizations, but the naive behaviour that it falls back on is pretty simple. It starts at the beginning of the string and the beginning of the RE. It proceeds through the string and the RE, and every time it has to make a choice it remembers that spot. Eventually it probably runs into a dead end (the next character it is looking for is "k" and it saw "m", aw shucks). It then backs up to the last choice it had and goes with the next option it has not tried. If it runs out of options entirely, it starts over one character further along in the string.

Stop and think about it: it is proceeding left to right in the string and basically left to right in the RE (quantifiers like * and + can make it loop around in the RE, though) in the most obvious manner possible.
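If it helps to see that naive behaviour spelled out, here is a minimal sketch of a backtracking matcher. It is written in Python rather than Perl, it only understands literal characters, "." and "*", and the names match_here and search are made up for this illustration. It is a toy that follows the description above, not how perl's real engine is written.

```python
def match_here(pattern, text, p=0, t=0):
    """Match pattern[p:] against text starting at index t.

    Returns the index just past the match, or None on a dead end.
    Hypothetical helper for illustration only.
    """
    if p == len(pattern):
        return t  # ran off the end of the RE: overall success
    atom = pattern[p]
    # Does the current atom match the current character? "." matches anything.
    fits = t < len(text) and atom in (".", text[t])
    if p + 1 < len(pattern) and pattern[p + 1] == "*":
        # Choice point: match the atom once more, or move on in the RE.
        if fits:
            end = match_here(pattern, text, p, t + 1)
            if end is not None:
                return end
        # First option failed or was unavailable: back up, try the other one.
        return match_here(pattern, text, p + 2, t)
    if fits:
        return match_here(pattern, text, p + 1, t + 1)
    return None  # dead end, the caller has to back up


def search(pattern, text):
    """Try each starting position left to right; return the matched text or None."""
    for start in range(len(text) + 1):
        end = match_here(pattern, text, 0, start)
        if end is not None:
            return text[start:end]
    return None


print(search("mk", "mmk"))        # mk
print(search("a.*c", "xabcbcx"))  # abcbc
```

The recursion is what does the "memorizing": each pending call is a remembered choice, and returning None from a call is the backing up.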

Now you may hear that (.*) is greedy, while (.*?) is not. How does that work? It is simple. Remember that it has to make choices? With either construct it has a choice each time it could match a ".": it can try to match another right away, or it can try to proceed through the RE. With (.*) it tries to match "." again first; with (.*?) it tries to proceed through the RE first. So (.*) winds up matching as many characters as it can while still managing to match overall, while (.*?) matches as few as it can. (Greedy vs non-greedy.)
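A quick way to see the difference is to capture with both forms against the same string. The snippet below uses Python's re module as a stand-in, since it is also a backtracking engine with the same greedy and non-greedy quantifiers; the sample string is made up for the illustration.

```python
import re

text = "<b>one</b> and <b>two</b>"

# Greedy: .* grabs as much as it can, then gives characters back
# until the trailing </b> can match.
greedy = re.search(r"<b>(.*)</b>", text)

# Non-greedy: .*? takes as little as it can, and only extends
# when the trailing </b> fails to match.
lazy = re.search(r"<b>(.*?)</b>", text)

print(greedy.group(1))  # one</b> and <b>two
print(lazy.group(1))    # one
```

Both patterns match overall; they only differ in which option gets tried first at each choice, which is why the captures differ.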

Now sit down with perlre and see if you can figure out the idea behind how it is implemented. When you feel comfortable, visit Death to Dot Star! for some of the gotchas. :-)


In reply to Re (tilly) 1: Regexp evaluation by tilly
in thread Regexp evaluation by Malkavian
