Corion and haukex have already referred to the likelihood that in your situation, adding a literal '+' to the match constrains the otherwise "greedy" .* match to stop with the first occurrence of the regex pattern. They have also recommended much more fundamentally robust approaches to solving your problem.

I'd love to know why.

WRT regex mechanics, I hope I can provide a detailed answer to your prayer. As already mentioned, this behavior can be demonstrated using any character (or, indeed, substring) as an explicit "anchor" for the match:

c:\@Work\Perl\monks>perl -wMstrict -le
"my $s = 'xxx xyzzyfooAbar yyy xyzzyzotBbar zzz';
 ;;
 my $match;
 ;;
 print qq{A: .*: '$match'}   if ($match) = $s =~ m{ (xyzzy .* bar) +}xms;
 print qq{B: .* A: '$match'} if ($match) = $s =~ m{ (xyzzy .* A bar) +}xms;
 print qq{C: .*?: '$match'}  if ($match) = $s =~ m{ (xyzzy .*? bar) +}xms;
 "
A: .*: 'xyzzyfooAbar yyy xyzzyzotBbar'
B: .* A: 'xyzzyfooAbar'
C: .*?: 'xyzzyfooAbar'

In example A, the greedy .* match grabs as much as it can (to the end of the string in this case), and the regex engine then backtracks only as far as the first point at which it can match an explicit 'bar' substring, which is the last 'bar' in the string. Unfortunately, this gives you more than you want even in the absence of the /g modifier: the engine finds the leftmost match, and a greedy quantifier makes that match as long as it can.
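One way to check where the greedy match in example A actually starts and stops is to compare the capture offsets in @- and @+ against index and rindex. This is only a minimal sketch using the same test string; the variable names $start and $end are mine, not from the original one-liner.

```perl
use strict;
use warnings;

my $s = 'xxx xyzzyfooAbar yyy xyzzyzotBbar zzz';

# Same pattern as example A; @- and @+ hold the start and end offsets of group 1.
my ($start, $end);
if ($s =~ m{ (xyzzy .* bar) +}xms) {
    ($start, $end) = ($-[1], $+[1]);
    printf "match runs from offset %d to %d\n", $start, $end;
    printf "first 'xyzzy' starts at %d\n", index($s, 'xyzzy');
    printf "last 'bar' ends at %d\n", rindex($s, 'bar') + length('bar');
}
```

The match begins at the first 'xyzzy' and ends just past the last 'bar', which is the leftmost start combined with the longest stretch that .* can keep.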

In example B, .* still grabs as much as it can (to the end of the string), but then the regex engine backtracks until it can match an explicit 'A' substring. Then matching moves forward again to find the 'bar' substring.
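To see that it is the anchor, and not anything special about a particular character, that decides where the match stops, the same pattern can be run with different literal anchors. A rough sketch; the anchors 'fooA' and 'B' and the %result hash are my own additions for illustration.

```perl
use strict;
use warnings;

my $s = 'xxx xyzzyfooAbar yyy xyzzyzotBbar zzz';

# Each anchor is a literal substring placed between .* and 'bar'.
my %result;
for my $anchor ('A', 'fooA', 'B') {
    my ($match) = $s =~ m{ (xyzzy .* \Q$anchor\E bar) }xms;
    $result{$anchor} = $match;
    print "anchor '$anchor': '$match'\n";
}
```

With 'A' or 'fooA' the match is pinned to the first record; with 'B' the greedy .* is free to run through to the second record, so the long match comes back.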

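In example C, the "lazy" modifier ? of the .*? match means that it will match as little as possible to achieve an overall match with 'bar'. The engine extends .*? one character at a time only when the rest of the pattern fails to match, so it stops at the first 'bar' and never has to back off from the end of the string.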
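The difference between the greedy and lazy forms is easiest to see when collecting every match with /g in list context. This is a sketch on the same test string, not necessarily how you would parse real data; the array names @lazy and @greedy are mine.

```perl
use strict;
use warnings;

my $s = 'xxx xyzzyfooAbar yyy xyzzyzotBbar zzz';

# In list context, /g returns every capture found while scanning left to right.
my @lazy   = $s =~ m{ (xyzzy .*? bar) }xmsg;
my @greedy = $s =~ m{ (xyzzy .*  bar) }xmsg;

print "lazy:   '$_'\n" for @lazy;
print "greedy: '$_'\n" for @greedy;
```

The lazy pattern yields two separate matches, one per 'xyzzy ... bar' record, while the greedy pattern swallows both records in a single match.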

Update: Corrected a couple of trivial spelling/formatting errors.


Give a man a fish:  <%-{-{-{-<


In reply to Re: If I'm matching a pattern wy does a + sign make things crazy? by AnomalousMonk
in thread If I'm matching a pattern wy does a + sign make things crazy? by SergioQ
