in reply to Interesting Regex Behavior

My amateurish guesses are as follows:

In the first pattern match, because you're using the * quantifier on the first set of parens, nothing is matched, since it's a lazy quantifier. So $1 gets nothing, and $2 is undefined since it isn't even executed.

In the second pattern match, the * goes to the end of the string, matches 'b', backtracks to the 'a', and then goes forward to the 'b', which explains the respective $DIGITs.

I suspect both guesses are pretty close to the truth, but I couldn't say for sure, not being at one with the perl regex engine and all ;)
HTH

_________
broquaint

Re: Re: Interesting Regex Behavior
by japhy (Canon) on Nov 22, 2002 at 15:43 UTC
    You're close, but a little off.

    In the first example, the quantifiers aren't lazy, but greedy, and THAT is why $2 is undef and $1 holds the empty string. First, the (a)* matches the "a", and stores "a" in $2, and so $1 is "a" as well. Then the outermost * makes the capturing block try again, and this time (a)* matches ZERO "a"s. Here's the trick: "y" =~ /(x)?/ succeeds, yet it leaves $1 undef, because a group that matches zero times captures nothing. Therefore, $2 becomes undef, and $1 becomes the empty string.
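
    The trick with the optional group can be checked directly. This is only a minimal sketch of that one-liner, not of the original pattern; the variable names $matched and $one_is_set are made up for illustration.

```perl
use strict;
use warnings;

# An optional group that matches zero times still lets the whole
# match succeed, but its capture variable is left undefined.
my $matched    = "y" =~ /(x)?/ ? 1 : 0;
my $one_is_set = defined $1    ? 1 : 0;

print "matched: $matched\n";         # did the overall match succeed?
print "\$1 defined: $one_is_set\n";  # did the skipped group capture anything?
```

    Testing with defined() rather than printing $1 directly avoids an "uninitialized value" warning when the capture is undef.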

    The second regex works thus. First, the (a) matches the "a" ($1 is "a", $2 is "a", and $3 is undef). Then the outer group repeats and the (b) matches the "b", but that does NOT reset $2's value to undef, even though (a) didn't match on this second pass. Therefore, $1 is "b" (note: "japhy" =~ /(\w)+/ stores "y" in $1, because a repeated group keeps only its last iteration), $2 is "a", and $3 is "b".
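
    A rough sketch for trying this at home. The "japhy" line is the note from above; the "ab" pattern is an ASSUMPTION, a stand-in reconstructed from the description, since the original regex isn't quoted in this reply. The helper show() is made up for illustration, and the captures you see for the stand-in may depend on your perl version.

```perl
use strict;
use warnings;

# Render a capture for printing, making undef visible.
sub show { my ($v) = @_; return defined $v ? "'$v'" : 'undef' }

# A quantified group keeps only its final iteration:
# (\w) runs once per character, and $1 holds the last one.
my $last = "japhy" =~ /(\w)+/ ? $1 : undef;
print "last iteration: ", show($last), "\n";

# ASSUMPTION: stand-in for the second pattern discussed in the thread.
# Prints whatever your perl leaves in $1, $2 and $3 after the match.
if ("ab" =~ /((a)|(b))+/) {
    print join(", ", map { show($_) } $1, $2, $3), "\n";
}
```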

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;