shankonit has asked for the wisdom of the Perl Monks concerning the following question:

Please help me understand these two concepts (greedy and lazy matching) in regular expressions, with patterns like . and .*.

Thank you in advance for taking the time to read and answer.

Re: What is greedy and lazy Matching in perl
by stevieb (Canon) on Jul 29, 2015 at 17:41 UTC

    A dot '.' character unescaped matches exactly one of any character (except newline by default). .* means match any character zero or more times but as many times as possible (it'll eat up the entire string up to the first newline). This is greedy. Essentially, greedy means match as much as absolutely possible while allowing the rest of the regex to still match.
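    As a minimal sketch of the behaviour described above, the following script shows what a lone dot and .* match, with and without the /s modifier. The sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text: two lines separated by a newline.
my $text = "first line\nsecond line";

# A lone dot matches exactly one character (here, the first one).
my ($one) = $text =~ /(.)/;

# .* is greedy but, by default, the dot does not match a newline,
# so the match stops at the end of the first line.
my ($greedy) = $text =~ /(.*)/;

# With /s the dot also matches newline, so .* takes the whole string.
my ($all) = $text =~ /(.*)/s;

print "one:    '$one'\n";       # 'f'
print "greedy: '$greedy'\n";    # 'first line'
print "all is the whole string\n" if $all eq $text;
```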

    To make something non-greedy, use a ? after the .*. It will then match as little as it can while still allowing the rest of the regex to match.

    my $str = "hello there world";

    The following captures 'world'. The first .* is greedy, so it grabs as much as it can and backs off only as far as the last space; the rest is then captured by the second .*.

    $str =~ /.*\s(.*)/;

    The regex below captures 'there world'. The non-greedy modifier after the first .* says any char up to a space, but stop at the first opportunity you can (the first space). The (.*) captures everything else.

    $str =~ /.*?\s(.*)/;
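    The two matches can be run side by side. This is only a small sketch; the variable names $greedy_rest and $lazy_rest are invented for the example.

```perl
use strict;
use warnings;

my $str = "hello there world";

# Greedy: the first .* backs off only as far as the last space.
my ($greedy_rest) = $str =~ /.*\s(.*)/;

# Non-greedy: the first .*? stops at the first space it can.
my ($lazy_rest) = $str =~ /.*?\s(.*)/;

print "greedy: $greedy_rest\n";   # world
print "lazy:   $lazy_rest\n";     # there world
```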

    See perlretut for much more information.

    -stevieb

Re: What is greedy and lazy Matching in perl
by Your Mother (Archbishop) on Jul 29, 2015 at 17:24 UTC
Re: What are greedy and lazy matching in Perl?
by Athanasius (Archbishop) on Jul 30, 2015 at 03:20 UTC

    Hello shankonit,

    If you have access to the Camel Book (4th Edition, 2012), you should look at Chapter 5, “Pattern Matching,” especially the section “The Little Engine That /Could(n’t)?/” (pages 241–6). Rule 5 covers the behaviour of quantifiers; the final two paragraphs nicely explain the difference between greedy and non-greedy matching in terms of “backward” vs. “forward” backtracking.

    Note for pedants: Messrs Christiansen, d foy, Wall, and Orwant use the term frugal as a synonym for non-greedy. :-)

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: What is greedy and lazy Matching in perl
by pme (Monsignor) on Jul 29, 2015 at 17:46 UTC
Re: What is greedy and lazy Matching in perl
by BrowserUk (Patriarch) on Jul 29, 2015 at 23:47 UTC

      Hello BrowserUk,

      I don’t think it’s pedantic (in the pejorative sense, anyway) to insist on correct terminology. Often, when one is trying to understand a new technical concept, getting the terminology clear is half the battle. For regex quantifiers, the distinction is between greedy, on the one hand, and non-greedy or frugal (see below) on the other. And “lazy” does suggest lazy evaluation, which has no application here.

      But... if “lazy” is taken in its more general sense of “doing less work,” then it may be worth noting here that frugal quantifiers do less work — and are in that sense “lazier” — than their greedy counterparts:

      use strict;
      use warnings;
      use Benchmark 'cmpthese';

      my $date   = '(30-Jul-2015)';
      my $string = 'A' x 1e5 . $date . 'B' x 1e5;

      cmpthese
      (
          1e4,
          {
              Greedy => sub
              {
                  $string =~ / ^ .*  ( \( \d{2} - \w{3} - \d{4} \) + ) /x;
                  $1 eq $date or die $!;
              },
              Frugal => sub
              {
                  $string =~ / ^ .*? ( \( \d{2} - \w{3} - \d{4} \) + ) /x;
                  $1 eq $date or die $!;
              },
          }
      );

      Typical output:

      17:36 >perl 1323_SoPW.pl
               Rate Greedy Frugal
      Greedy 1068/s     --   -70%
      Frugal 3555/s   233%     --

      17:37 >

      Of course, this information will be useful only for those situations in which it can be known in advance that a frugal match is guaranteed to produce the same result as its greedy equivalent. So far I haven’t thought of any practical examples that qualify. But “no knowledge is ever wasted,” as my mother always says.
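      To illustrate why that guarantee matters, here is a rough sketch in which the two forms give different results. It assumes a hypothetical input containing two dates rather than the single date used in the benchmark, and a simplified date pattern held in $date_re.

```perl
use strict;
use warnings;

# Hypothetical input with two dates, unlike the single-date benchmark string.
my $string  = 'x(01-Jan-2015)y(30-Jul-2015)z';
my $date_re = qr/\(\d{2}-\w{3}-\d{4}\)/;

# Greedy .* consumes as much as possible, so the capture is the last date.
my ($greedy) = $string =~ /^.*($date_re)/;

# Frugal .*? consumes as little as possible, so the capture is the first date.
my ($frugal) = $string =~ /^.*?($date_re)/;

print "greedy: $greedy\n";   # (30-Jul-2015)
print "frugal: $frugal\n";   # (01-Jan-2015)
```

      With only one candidate in the string, as in the benchmark, both forms capture the same text and only the amount of backtracking differs.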

      Yours in pedantry, :-)

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        ... if “lazy” is taken in its more general sense of “doing less work,” then ... frugal quantifiers do less work — and are in that sense “lazier” ...

        But the first definition of the first citation of lazy given here suggests that rather than just passively happening to do less, "lazy" actively avoids effort: "Disinclined to action or exertion; averse to labor; ...; shirking work." This seems to me to capture the essence of the behavior of lazy quantification.

        I can be happy with frugal, but I'm too lazy to type the extra characters. Non-greedy is a bit too Newspeak for me.


        Give a man a fish:  <%-(-(-(-<

Re: What is greedy and lazy Matching in perl
by Anonymous Monk on Jul 29, 2015 at 23:28 UTC
Re: What is greedy and lazy Matching in perl
by Anonymous Monk on Jul 30, 2015 at 00:24 UTC
    How much of the string GOFOOGOBARGO will the pattern .*GO match? Greedy (.*GO) = all of it. Non-greedy (.*?GO) = only the first two characters.
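    A quick way to check this, sketched with made-up variable names:

```perl
use strict;
use warnings;

my $s = 'GOFOOGOBARGO';

# Greedy: .* takes everything, then gives back just enough for the final GO.
my ($greedy) = $s =~ /(.*GO)/;

# Non-greedy: .*? takes nothing at first, so the leading GO already satisfies it.
my ($lazy) = $s =~ /(.*?GO)/;

print "greedy: $greedy\n";   # GOFOOGOBARGO
print "lazy:   $lazy\n";     # GO
```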