And I thought the \d{11,}? was the invented bit. I'll have to play with that sometime.

You should; the future of all new Perl regex features rests upon that syntax.

Re^5: why my reg ex matches greedy?
by roboticus (Chancellor) on Jun 26, 2012 at 12:14 UTC

    I don't see the relation between your link and that bit of the regex. However, that said, your skepticism of its utility seems justified. I tried to find a use for it, but I haven't been able to make \d{4,}? act any differently than \d{4}. It's either a useless construct, or a failure of my imagination in coming up with an appropriate test case.

    Putting aside what YAPE::Regex::Explain says about it, when I looked at it originally, I thought "Yack! Perl is gonna bitch about that weird '?' character". I could think of a couple other interpretations, so I put together a bit of code to check 'em out:

    my @tests = (
        'First case',
        '123456789second & third case',
    );
    for my $t (@tests) {
        print "\nchecking '$t'\n";
        if ($t=~/(\d{4,}?)(.*)/) {
            print "A: $1, $2\n";
        }
        if ($t=~/(\d{4,}?)(.*?)$/) {
            print "B: $1, $2\n";
        }
    }

    The other interpretations I could think of were:

    • An optional set of 4 or more digits, kind of like (?:\d{4,})?. If true, the first case would give us:

      A: , First case
    • Exactly 4 digits, like \d{4}, giving us:

      checking '123456789second & third case'
      A: 1234, 56789second & third case
      B: 1234, 56789second & third case
    • 4 or more digits, with as few as possible, yielding:

      checking '123456789second & third case'
      A: 1234, 56789second & third case
      B: 123456789, second & third case

    On reading the ...explain() output, I thought that I could perhaps make the third case come about. But what I actually got was:

    $ perl xxxyyyzzz.pl

    checking 'First case'

    checking '123456789second & third case'
    A: 1234, 56789second & third case
    B: 1234, 56789second & third case

    So I'm thinking that my initial surprise was justified, even though it's syntactically correct.
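
A minimal sketch of why test B above still stopped at four digits, offered as an illustration rather than anything from the original test script. In (\d{4,}?)(.*?)$ the lazy (.*?) is free to grow to the end of the string, so the engine never needs to extend \d{4,}?. If what follows cannot match a digit, the lazy quantifier has to grow, and it then differs from \d{4}. The helper name first_capture and the [a-z] follow-on are assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return $1 for a pattern applied to a string, or undef.
sub first_capture {
    my ($str, $re) = @_;
    return $str =~ $re ? "$1" : undef;
}

my $str = '123456789second & third case';

# (.*?) can absorb the remaining digits, so \d{4,}? stops at the minimum.
my $lazy_free = first_capture($str, qr/(\d{4,}?)(.*?)$/);

# [a-z] cannot match a digit, so \d{4,}? must keep growing until it reaches 's'.
my $lazy_forced = first_capture($str, qr/(\d{4,}?)[a-z]/);

# \d{4} cannot grow, so the match has to start later, at the last four digits.
my $exact = first_capture($str, qr/(\d{4})[a-z]/);

print "lazy, free:   $lazy_free\n";    # 1234
print "lazy, forced: $lazy_forced\n";  # 123456789
print "exact:        $exact\n";        # 6789
```

So the construct only shows its own behaviour when the rest of the pattern pushes back on it.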

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      It's different if there is some following match condition:

      use strict;
      use warnings;

      for my $reg ('(x\d{3,}?)', '(x\d{3,}?x)', '(x\d{3,})', '(x\d{3})') {
          for my $str ('xx', 'x12x', 'x123456x', 'x12x x123x') {
              print "Matched using $reg: $1\n" if $str =~ $reg;
          }
      }

      Prints:

      Matched using (x\d{3,}?): x123
      Matched using (x\d{3,}?): x123
      Matched using (x\d{3,}?x): x123456x
      Matched using (x\d{3,}?x): x123x
      Matched using (x\d{3,}): x123456
      Matched using (x\d{3,}): x123
      Matched using (x\d{3}): x123
      Matched using (x\d{3}): x123

      True laziness is hard work

        GrandFather:

        Thanks. I updated your code to make it a little more visually obvious to me, then added a couple cases. Now that I see what the difference is, I doubt that I'd ever use it. Not because it isn't useful, but rather because if I ever need it, I'm sure I'll have long forgotten it. But it's certainly educational.

        For grins, here's what I came up with:

        $ cat splok.pl
        use strict;
        use warnings;

        my @regs = ('(x\d{3,}?)', '(x\d{3,}?x)', '(x\d{3,})',
                    '(x\d{3})', '(x\d{3,}x)', '(x\d{3,}?x?)');
        my @strs = ('xx', 'x12x', 'x123456x', 'x12x x123x', 'x123456y');

        printf "%-12.12s ", $_ for " ", @regs;
        print "\n";
        for my $str (@strs) {
            printf "%-12.12s ", $str;
            for my $reg (@regs) {
                printf "%-12.12s ", ($str=~$reg) ? $1: '-nope-';
            }
            print "\n";
        }

        $ perl splok.pl
                     (x\d{3,}?)   (x\d{3,}?x)  (x\d{3,})    (x\d{3})     (x\d{3,}x)   (x\d{3,}?x?)
        xx           -nope-       -nope-       -nope-       -nope-       -nope-       -nope-
        x12x         -nope-       -nope-       -nope-       -nope-       -nope-       -nope-
        x123456x     x123         x123456x     x123456      x123         x123456x     x123
        x12x x123x   x123         x123x        x123         x123         x123x        x123x
        x123456y     x123         -nope-       x123456      x123         -nope-       x123

        (As you can tell, I like things laid out in grids. I organize lots of stuff with database tables and spreadsheets...)
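
One thing the grid suggests: the (x\d{3,}?x) and (x\d{3,}x) columns agree on every row, presumably because \d can never match the x that has to follow, so lazy and greedy are forced to the same end point. A rough sketch of a case where they part ways, using a follow-on character that is itself a digit. The test string and the helper name captures are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: $1 for each pattern against one string, '-nope-' on failure.
sub captures {
    my ($str, @patterns) = @_;
    my @out;
    for my $re (@patterns) {
        push @out, ($str =~ $re) ? "$1" : '-nope-';
    }
    return @out;
}

# Made-up input in which the digit 5 appears twice.
my $str = 'x12345675y';

my ($lazy, $greedy, $exact) = captures($str,
    qr/(x\d{3,}?5)/,   # lazy: grows one digit at a time, stops at the first 5
    qr/(x\d{3,}5)/,    # greedy: takes every digit, backs off to the last 5
    qr/(x\d{3}5)/,     # exact: needs a 5 right after three digits, finds a 4
);

printf "%-12.12s %-12.12s %-12.12s\n", 'lazy', 'greedy', 'exact';
printf "%-12.12s %-12.12s %-12.12s\n", $lazy, $greedy, $exact;
# lazy is x12345, greedy is x12345675, exact is -nope-
```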

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.