pglenski has asked for the wisdom of the Perl Monks concerning the following question:

I copied this code from the tutorial page but it doesn't work. Tutorial by Roy Johnson. Perhaps my $_ is wrong. I just want to capture "a". Any ideas?
    use English;

    $_ = "foo a baz bar";

    /foo         # Match starting at foo
     (?:         # Complex expression:
       (?!baz)   # make sure we're not at the beginning of baz
       .         # accept any character
     )*          # any number of times
     bar         # and ending at bar
    /x;

    print "$_ \n";
    print " \$1 = $1 \n";
    print " \$2 = $2 \n";
    print " \$& = $MATCH \n";
    print " \$` = $PREMATCH \n";
    print " \$' = $POSTMATCH \n";
    print " \$+ = $LAST_PAREN_MATCH \n";

Re: Regex look ahead
by Tanktalus (Canon) on Aug 30, 2006 at 20:12 UTC

    Well, if you read the comments there, we start matching at foo. We match character by character (unless we get to the beginning of baz). Then, we must match bar.

    Going through your string, we start at foo. Good. We check if we're at the beginning of baz - we aren't. Grab a character (' '). Still not at the beginning of baz, so we grab another character (' a'). Still not at the beginning of baz, so we grab another character (' a '). Now we're at the beginning of baz, so the lookahead fails and we exit the "loop" having consumed ' a '. Next we need to match 'bar' - but what follows is 'baz', not 'bar'. Thus we failed to match anything. (Actually, the engine will then back off the three characters that we did match in the complex expression, one at a time, but that will still fail to find 'bar'.)
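The walk-through above can be checked with a small sketch. The second input string, "foo a bar", is a hypothetical example that is not in the original post; it is there only to show the same pattern succeeding once no "baz" sits between "foo" and "bar".

```perl
use strict;
use warnings;

# Same pattern as in the question, compiled once so it can be reused.
my $re = qr/foo          # Match starting at foo
            (?:          # Complex expression:
              (?!baz)    # make sure we're not at the beginning of baz
              .          # accept any character
            )*           # any number of times
            bar          # and ending at bar
           /x;

# The first string is from the question; the second is an assumed example.
my @inputs = ("foo a baz bar", "foo a bar");

# Records the matched text per input, or undef when the match fails.
my %matched;

for my $str (@inputs) {
    if ($str =~ $re) {
        $matched{$str} = $&;
        print "'$str' matched: '$&'\n";
    }
    else {
        $matched{$str} = undef;
        print "'$str' did not match\n";
    }
}
```

The first string fails because the loop can never step past the start of "baz", so "bar" is never reached; the second succeeds after the greedy loop gives back the last three characters.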

    One rule of thumb is to always check if a match succeeds before using any of the special variables. So try something like this:

    if ( /foo         # Match starting at foo
          (?:         # Complex expression:
            (?!baz)   # make sure we're not at the beginning of baz
            .         # accept any character
          )*          # any number of times
          bar         # and ending at bar
         /x )
    {
        # print stuff out.
    }
    else {
        print "Failed to match against '$_'\n";
    }

    Hope that helps,

Re: Regex look ahead
by imp (Priest) on Aug 31, 2006 at 04:52 UTC
    Tanktalus provided an excellent description of how the regular expression attempted to match your string. If you would like to experiment with the regex some more, I recommend using YAPE::Regex::Explain to get English descriptions of how a given regular expression works. It's much easier to read than the output of use re 'debug'.

    use strict;
    use warnings;
    use YAPE::Regex::Explain;

    my $regex = qr/foo         # Match starting at foo
                   (?:         # Complex expression:
                     (?!baz)   # make sure we're not at the beginning of baz
                     .         # accept any character
                   )*          # any number of times
                   bar         # and ending at bar
                  /x;

    my $explanation = YAPE::Regex::Explain->new($regex)->explain;
    print $explanation;

    Outputs:
    The regular expression:

    (?x-ims:foo         # Match starting at foo
                   (?:         # Complex expression:
                     (?!baz)   # make sure we're not at the beginning of baz
                     .         # accept any character
                   )*          # any number of times
                   bar         # and ending at bar
                  )

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?x-ims:                 group, but do not capture (disregarding
                             whitespace and comments) (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n):
    ----------------------------------------------------------------------
      foo                      'foo'
    ----------------------------------------------------------------------
      (?:                      group, but do not capture (0 or more
                               times (matching the most amount
                               possible)):
    ----------------------------------------------------------------------
        (?!                      look ahead to see if there is not:
    ----------------------------------------------------------------------
          baz                      'baz'
    ----------------------------------------------------------------------
        )                        end of look-ahead
    ----------------------------------------------------------------------
        .                        any character except \n
    ----------------------------------------------------------------------
      )*                       end of grouping
    ----------------------------------------------------------------------
      bar                      'bar'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

Re: Regex look ahead
by Fletch (Bishop) on Aug 30, 2006 at 20:16 UTC

    Part of the problem might be that there are no capturing parens in that regex; (?:) marks a non-capturing grouping, so $1 would stay unset even if the match succeeded.
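
Putting the replies together, one possible way to capture just "a" is sketched below. This is not the tutorial's pattern: it assumes the goal is "whatever sits between foo and baz", so it ends the match at baz rather than bar, wraps the loop in capturing parens, and widens the lookahead to (?!\s*baz) so the captured text carries no trailing whitespace. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $str = "foo a baz bar";
my $captured;    # stays undef if the match fails

if ($str =~ /foo \s+            # literal foo, then the separating whitespace
             (                  # capturing group, fills $1
               (?:              # non-capturing loop body
                 (?!\s*baz)     # stop before optional whitespace plus baz
                 .              # otherwise accept any character
               )*               # any number of times
             )
             \s* baz            # the match ends at baz, not bar
            /x)
{
    # Only read $1 after the match is known to have succeeded.
    $captured = $1;
    print "captured: '$captured'\n";
}
else {
    print "Failed to match against '$str'\n";
}
```

With the string from the question this should print captured: 'a'. If the text really must end at bar, the pattern has to consume baz explicitly somewhere, because the loop refuses to step over it.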