in reply to Remove all lines, except those starting with pattern

Hi bbs2web,

Think of positive/negative look-ahead/look-behind regular expression patterns like this:

Positive Look-ahead:
(PATTERN_OF_INTEREST)(?=FOLLOWING_PATTERN)
Capture PATTERN_OF_INTEREST when it precedes the FOLLOWING_PATTERN

e.g.    'this(?=\ that)' # match 'this' when followed by ' that'

Negative Look-ahead:
(PATTERN_OF_INTEREST)(?!FOLLOWING_PATTERN)
Capture PATTERN_OF_INTEREST when it DOES NOT precede the FOLLOWING_PATTERN

e.g.    'this(?!\ that)' # match 'this' when NOT followed by ' that'
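
To make the two look-ahead forms concrete, here is a minimal sketch. The sample strings and variable names are made up purely for illustration; only the two patterns come from the examples above.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @samples = ('this that', 'this other');

# Positive look-ahead: 'this' matches only when ' that' comes next.
my @followed = grep { /this(?=\ that)/ } @samples;

# Negative look-ahead: 'this' matches only when ' that' does NOT come next.
my @not_followed = grep { /this(?!\ that)/ } @samples;

print "followed by ' that':     @followed\n";
print "not followed by ' that': @not_followed\n";
```

Each sample string should end up in exactly one of the two lists, since the two assertions are opposites at the same position.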

Positive Look-behind:
(?<=PRECEDING_PATTERN)(PATTERN_OF_INTEREST)
Capture PATTERN_OF_INTEREST when it follows the PRECEDING_PATTERN

e.g.    '(?<=this\ )that' # match 'that' when preceded by 'this '

Negative Look-behind:
(?<!PRECEDING_PATTERN)(PATTERN_OF_INTEREST)
Capture PATTERN_OF_INTEREST when it DOES NOT follow the PRECEDING_PATTERN

e.g.    '(?<!this\ )that' # match 'that' when NOT preceded by 'this '
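
The look-behind forms can be tried the same way. Again, this is only a sketch with made-up sample strings and variable names; the patterns are the ones shown above.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @samples = ('this that', 'not that');

# Positive look-behind: 'that' matches only when 'this ' comes right before it.
my @preceded = grep { /(?<=this\ )that/ } @samples;

# Negative look-behind: 'that' matches only when 'this ' does NOT come right before it.
my @not_preceded = grep { /(?<!this\ )that/ } @samples;

print "preceded by 'this ':     @preceded\n";
print "not preceded by 'this ': @not_preceded\n";
```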

Both the PRECEDING_PATTERN and the FOLLOWING_PATTERN are matched as 'look-around' patterns, but neither is captured to a variable; only what is matched by your (PATTERN_OF_INTEREST) capture group is captured (i.e. if you use capturing parentheses).
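
One way to see this for yourself is to look at what the match actually covers. The following is a rough sketch, not the only way to check it; the string and variable names are hypothetical. It uses the built-in @- and @+ arrays to recover the matched text, and a substitution to show that the look-ahead text is left untouched.

```perl
use strict;
use warnings;

my $s = 'this that';   # hypothetical sample string

my ($captured, $matched) = ('', '');
if ($s =~ /(this)(?=\ that)/) {
    $captured = $1;                                  # what the capture group holds
    $matched  = substr($s, $-[0], $+[0] - $-[0]);    # what the whole match covers
}

# The look-ahead text is not part of the match, so s/// leaves it alone.
(my $replaced = $s) =~ s/this(?=\ that)/THIS/;

printf "captured: '%s'\n", $captured;
printf "matched:  '%s'\n", $matched;
printf "replaced: '%s'\n", $replaced;
```

Both the capture and the overall match should be just 'this'; the ' that' is checked but never consumed.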

Edit: replaced 'followed' with 'preceded' as correctly pointed out by AnomalousMonk. Thanks for spotting that :) It was late...

Re^2: Remove all lines, except those starting with pattern
by AnomalousMonk (Archbishop) on Aug 13, 2018 at 16:53 UTC

    Note that in the comments of the code examples in the Positive and Negative Look-behind sections, e.g.
        '(?<=this\ )that' # match 'that' when followed by 'this '
        '(?<!this\ )that' # match 'that' when NOT followed by 'this '
    the word "followed" should be "preceded" in both cases, i.e.
        '(?<=this\ )that' # match 'that' when preceded by 'this '
        '(?<!this\ )that' # match 'that' when NOT preceded by 'this '
    respectively.

    Both the PRECEDING_PATTERN and the FOLLOWING_PATTERN are matched as the 'look-around' patterns, but neither are captured to a variable, only what is matched as part of your (PATTERN_OF_INTEREST) capture group ...

    As an interesting (I hope) side note, a capture group embedded within a positive lookaround assertion will capture something:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = 'foobar';
     ;;
     print qq{positive look-behind capture: '$1'} if $s =~ m{ (?<= (foo)) bar }xms;
     print qq{positive look-ahead capture: '$1'} if $s =~ m{ foo (?= (bar)) }xms;
    "
    positive look-behind capture: 'foo'
    positive look-ahead capture: 'bar'

    An embedded positive look-ahead capture is a way to capture and extract certain overlapping matches:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my $s = 'abcd efgh';
     ;;
     my @caps = $s =~ m{ \w+ }xmsg;
     dd 'non-overlapping captures: ', \@caps;
     ;;
     @caps = $s =~ m{ (?= (\w+)) }xmsg;
     dd 'overlapping captures: ', \@caps;
    "
    ("non-overlapping captures: ", ["abcd", "efgh"])
    (
      "overlapping captures: ",
      ["abcd", "bcd", "cd", "d", "efgh", "fgh", "gh", "h"],
    )
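
The overlapping matches arise because the look-ahead itself is zero-width: each /g match consumes nothing, so the regex engine moves on one character and tries again. A small sketch that records the offset of each match makes this visible (the @trace variable and the shorter sample string are just for illustration):

```perl
use strict;
use warnings;

my $s = 'abcd';   # hypothetical sample string

my @trace;
while ($s =~ m{ (?= (\w+)) }xmsg) {
    # $-[0] is the offset where this zero-width match occurred; $1 is its capture.
    push @trace, "$-[0]:$1";
}

print "@trace\n";
```

Every entry starts one character further along than the previous one, while the capture always runs to the end of the word.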

    Update: Slight, essentially trivial wording and emphasis changes.


    Give a man a fish:  <%-{-{-{-<

      ...interesting indeed, a double negative :)...switching off the anti-capture could prove useful at some point...(but I must always remember that I will forget what I did and why I did it, so I have to keep it simple for my dumb future self :P)