in reply to Greedy modifier found to be working non-greedy in a named group

$_ = "This is a teeeext for testting";
/(?<char>e*)/ and print "'$+{char}' is matched pattern at $-[0]\n";

Outputs:

'' is matched pattern at 0

e* matches the empty string at the beginning of the string; try e+ instead.

Re^2: Greedy modifier found to be working non-greedy in a named group
by rkabhi (Acolyte) on Nov 29, 2019 at 11:44 UTC
    @tybalt89 I have tried e+ already and I know that it works. But e* should also have worked, because '*' is a greedy modifier unless it is followed by a '?'. So, in my example, both e* and e+ should have given the same output. Why is there a difference in output?

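      The greedy operators still follow a leftmost-first strategy: the engine takes the leftmost position where the pattern can match, and greediness only decides how long the match at that position is. That means the leftmost match wins even if a longer match exists later in the string.

      e* can match zero occurrences of e. The leftmost place where you can match zero e's is at the start of the string.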

      A greedy match is the longest representation of the first match. In the case of /e*/ the first match is the start of the string, and is zero characters long because the first character is not an 'e'. In the case of /e+/ the first match starts at the first 'e' and continues until it finds a non-'e' character.
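For anyone who wants to see this side by side, here is a minimal sketch (not taken from the posts above) that records the start offset and length of the first match for both patterns. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "This is a teeeext for testting";

# /e*/ succeeds immediately at offset 0 with an empty match,
# because zero 'e' characters is an acceptable match there.
my ($star_start, $star_len);
if ($text =~ /e*/) {
    ($star_start, $star_len) = ($-[0], $+[0] - $-[0]);
}

# /e+/ needs at least one 'e', so the engine advances to offset 11;
# only then does greediness extend the match over all four 'e's.
my ($plus_start, $plus_len);
if ($text =~ /e+/) {
    ($plus_start, $plus_len) = ($-[0], $+[0] - $-[0]);
}

print "e*: start=$star_start length=$star_len\n";   # e*: start=0 length=0
print "e+: start=$plus_start length=$plus_len\n";   # e+: start=11 length=4
```

Both quantifiers are greedy; they differ only in whether the empty match at offset 0 is acceptable.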

      > both e* and e+ should have given same output. Why is the difference in output?

      Good question, I think many are confused.

      It's important to understand that empty matches exist and that e* means e{0}|e+ ( or something like e{0,32766} ° ).

      So you are actually matching e{0} before all!

      Printing out the match position via @+ helps to demonstrate it:

      DB<1> $_ = "This is a teeeext for testting";
      DB<2> ;/(?<char>e*)/ and print "'$+{char}' is matched pattern at pos $+[0]\n";
      '' is matched pattern at pos 0
      DB<3> ;/(?<char>e{0})/ and print "'$+{char}' is matched pattern at pos $+[0]\n";
      '' is matched pattern at pos 0
      DB<4> ;/(?<char>e+)/ and print "'$+{char}' is matched pattern at pos $+[0]\n";
      'eeee' is matched pattern at pos 15
      DB<5> ;/(?<char>e{1,32766})/ and print "'$+{char}' is matched pattern at pos $+[0]\n";
      'eeee' is matched pattern at pos 15
      DB<6>

      update
      • Actually $+[0] gives you the end of the first match, $-[0] will give you the start.
      • e* is not limited in my Perl version but the upper bound in e{,} is.
      • I couldn't find a way to get the positions of named groups directly.
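As a possible workaround for the last point: named capture groups are also numbered, so the offsets of a named group can be read from @- and @+ using the group's number. A minimal sketch, assuming the named group is the first (and only) capture group in the pattern; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = "This is a teeeext for testting";

my ($value, $start, $end);
if ($text =~ /(?<char>e+)/) {
    $value = $+{char};   # captured text, looked up by name
    $start = $-[1];      # start offset of group 1 (the same group as "char")
    $end   = $+[1];      # offset just past the end of group 1
}

print "'$value' starts at $start and ends before $end\n";
# 'eeee' starts at 11 and ends before 15
```

This relies on knowing which number the named group has, so it gets more fragile as a pattern gains more groups.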

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery
      FootballPerl is like chess, only without the dice

      °) Yes, there is a maximum upper bound in range quantifiers, which I expected to be around 2**16, so it's an incomplete analogy.

        > ...and that e* means e{0}|e+ ( or something like e{0,32766} ° ).

        > So you are actually matching e{0} before all!


        And if the sentence begins with "e.."? :)
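A minimal sketch of that case, using a made-up test string that starts with 'e'. Perl tries alternatives left to right, so the order of the two branches matters: with e{0} first the alternation settles for the empty match, while e* takes the whole run of 'e's.

```perl
use strict;
use warnings;

my $text = "eeek, a teeeext";   # hypothetical input that begins with 'e'

my ($star)  = $text =~ /(e*)/;        # greedy: as many 'e' as possible at offset 0
my ($alt_a) = $text =~ /(e{0}|e+)/;   # e{0} is tried first and succeeds: empty match
my ($alt_b) = $text =~ /(e+|e{0})/;   # e+ is tried first: same result as e* here

print "e*      : '$star'\n";    # e*      : 'eee'
print "e{0}|e+ : '$alt_a'\n";   # e{0}|e+ : ''
print "e+|e{0} : '$alt_b'\n";   # e+|e{0} : 'eee'
```

So e+|e{0} is the closer analogy for a greedy e*, and e{0}|e+ behaves more like the non-greedy e*?.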
        That was a very good explanation.

        Thanks a lot !!

        Regards,
        Abhishek

      Thanks, it's clear now.