Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks.

Strawberry Perl (5.24.0) on a Windows XP sp3 machine, the following code:

    use strict;
    use warnings;
    use v5.24;

    my $s = "dogcat 42 birdfish";

    if ($s =~ /(?<=dog).*\d+(?=fish)/sgxi) {
        say "Ah! The meaning of life!\n\n";
    }


...doesn't appear to be matching the 42 with the pattern provided unless I change the number to a string, e.g., 'bird'. I also attempted to use capturing parentheses, but the result was the same.

Thank you monks.

Sarah

Re: Positive lookbehind and lookahead confusion
by Corion (Patriarch) on Nov 27, 2016 at 09:46 UTC

    It cannot match because the pattern expects \d+ to be followed immediately by fish, and that sequence is not present in your string.

    Maybe you want an additional .* after the \d+?

      Thank you Corion. That was slightly embarrassing. :)
Re: Positive lookbehind and lookahead confusion
by Athanasius (Archbishop) on Nov 27, 2016 at 14:34 UTC

    Hello Sarah,

    This may sound like heresy, but actually you don’t need any lookaround assertions here. Building on the insights from other monks above, but removing the lookarounds and instead adding capturing parentheses:

        0:24 >perl -wE "my $s = qq[dogcat\n42\nbirdfish]; say qq[\nAh! The meaning of life is $1!] if $s =~ /dog.*?(\d+).*fish/s;"

        Ah! The meaning of life is 42!

        0:25 >

    Lookarounds are documented in perlre#Extended-Patterns. In particular:

    Lookaround assertions are zero-width patterns which match a specific pattern without including it in $&. (Emphasis added)

    Unless you need to exclude the pattern from $&, you probably don’t need positive lookaround assertions. (Negative lookarounds are a different matter.)

    Hope that’s of interest,

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,
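
The point about $& in the reply above can be checked directly. The following is only a minimal sketch: the helper name matched_text is made up for illustration, and the sample string is assumed to contain newlines, as in the one-liner above. It runs the same pattern with and without the lookarounds and prints what ends up in $&.

```perl
use strict;
use warnings;
use feature 'say';

# Sample string; the embedded newlines are an assumption taken from the reply above.
my $s = "dogcat\n42\nbirdfish";

# Hypothetical helper: return the whole matched text ($&), or undef on failure.
sub matched_text {
    my ($string, $re) = @_;
    return $string =~ $re ? $& : undef;
}

# With lookarounds, 'dog' and 'fish' must be present but are not part of $&.
my $with_lookaround = matched_text($s, qr/(?<=dog).*?\d+.*(?=fish)/s);

# Without lookarounds, 'dog' and 'fish' are included in $&.
my $without = matched_text($s, qr/dog.*?\d+.*fish/s);

# Show newlines as a literal \n so each result fits on one line.
say "with lookarounds:    [", $with_lookaround =~ s/\n/\\n/gr, "]";
say "without lookarounds: [", $without =~ s/\n/\\n/gr, "]";
```

If only the captured number matters, as in the one-liner above, the difference in $& is irrelevant and the plain pattern with capturing parentheses is enough.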

Re: Positive lookbehind and lookahead confusion
by kcott (Archbishop) on Nov 27, 2016 at 10:07 UTC

    G'day Sarah,

    There's nothing wrong with either assertion. The apparent problem occurs because the first '.*' matches 'cat' and all the whitespace that follows it, but nothing in the pattern matches 'bird' and the whitespace that precedes it, so '\d+' is never immediately followed by the 'fish' that the lookahead requires.

    Here's a simplified example of what you're currently doing:

        $ perl -E 'say("X\n\t99\n\tY" =~ /(?<=X).*\d+(?=Y)/s ? 1 : 0)'
        0

    Here's how you might fix that:

        $ perl -E 'say("X\n\t99\n\tY" =~ /(?<=X).*\d+.*(?=Y)/s ? 1 : 0)'
        1

    I'd also suggest you take a look at "perlre: Modifiers": note I only used 's'; the 'gxi' are not needed.

    — Ken
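
On the remark that /g is not needed: in a boolean test it can also change behaviour, because a scalar-context m//g resumes from pos() of the string instead of starting at the beginning. The sketch below is an illustration only, not part of the original reply; it reuses the simplified string from above and tests the same pattern twice with and without /g.

```perl
use strict;
use warnings;
use feature 'say';

my $s = "X\n\t99\n\tY";

# Without /g, every scalar-context match starts at the beginning of the string.
my @plain;
for (1 .. 2) {
    push @plain, ($s =~ /\d+/ ? 1 : 0);
}

# With /g in scalar context, each match resumes at pos($s). The second
# attempt starts after '99', finds no more digits, and fails.
my @global;
for (1 .. 2) {
    push @global, ($s =~ /\d+/g ? 1 : 0);
}

say "without /g: @plain";
say "with /g:    @global";
```

For a single if test on a fresh string, as in the original question, the /g makes no visible difference, which is why dropping it is the simpler choice.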

Re: Positive lookbehind and lookahead confusion
by AnomalousMonk (Archbishop) on Nov 27, 2016 at 13:57 UTC

    In addition to those pointed out by Corion and kcott, there's another little problem with the OPed regex. The first .* is greedy: it grabs as much as it can and backtracks only far enough to leave a single digit for \d+ to match, so it "consumes" all but one of the available digits, giving a wrong answer to the question of Life, the Universe and Everything:

        c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "my $s = qq{DoGcat\n 42\n birdfIsH}; dd $s; print qq{<<$s>>}; ;; print qq{LUE == '$1'} if $s =~ /(?<=dog) .* (\d+) .* (?=fish)/sxi; "
        "DoGcat\n 42\n birdfIsH"
        <<DoGcat
         42
         birdfIsH>>
        LUE == '2'

    This problem could be solved by either making the first  .* "lazy", as I like to call it:

        c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "my $s = qq{DoGcat\n 42\n birdfIsH}; dd $s; print qq{<<$s>>}; ;; print qq{LUE == '$1'} if $s =~ /(?<=dog) .*? (\d+) .* (?=fish)/sxi; "
        "DoGcat\n 42\n birdfIsH"
        <<DoGcat
         42
         birdfIsH>>
        LUE == '42'

    or by being more specific about what this group should match:

        c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "my $s = qq{DoGcat\n 42\n birdfIsH}; dd $s; print qq{<<$s>>}; ;; print qq{LUE == '$1'} if $s =~ /(?<=dog) \D* (\d+) .* (?=fish)/sxi; "
        "DoGcat\n 42\n birdfIsH"
        <<DoGcat
         42
         birdfIsH>>
        LUE == '42'

    (I'm also not using the useless  /g modifier.) (Update: Of course, none of this matters if you don't really care what the answer is, only that there is an answer! :)


    Give a man a fish:  <%-{-{-{-<
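
The three one-liners above depend on Windows cmd quoting and on Data::Dump. As a rough, portable sketch of the same comparison, the script below runs the three patterns side by side using only core Perl. The helper name capture and the labels are invented for illustration.

```perl
use strict;
use warnings;
use feature 'say';

my $s = "DoGcat\n 42\n birdfIsH";

# The three patterns from the one-liners above, each with a descriptive label.
my @variants = (
    [ 'greedy .*' => qr/(?<=dog) .*  (\d+) .* (?=fish)/sxi ],
    [ 'lazy .*?'  => qr/(?<=dog) .*? (\d+) .* (?=fish)/sxi ],
    [ '\D*'       => qr/(?<=dog) \D* (\d+) .* (?=fish)/sxi ],
);

# Hypothetical helper: return the first capture group, or undef on failure.
sub capture {
    my ($string, $re) = @_;
    return $string =~ $re ? $1 : undef;
}

for my $v (@variants) {
    my ($label, $re) = @$v;
    my $got = capture($s, $re);
    say sprintf "%-10s LUE == '%s'", $label, $got // 'no match';
}
```

The greedy variant captures only the last digit, while the lazy and \D* variants capture the whole number.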