bbs2web has asked for the wisdom of the Perl Monks concerning the following question:

I appear to misunderstand the negated look-ahead assertion. I'm trying to remove all lines that don't start with a certain pattern:

Validating that my regex matches:
[admin@kvm5a ~]# rbd showmapped | perl -pe 's/^\d+\s+.*\n//g'
id pool image snap device

When trying to use the negated look ahead (?!) it doesn't remove the header:
[admin@kvm5a ~]# rbd showmapped | perl -pe 's/^?!\d+\s+.*\n//g'
id pool image snap device
0 rbd_hdd vm-209-disk-1 - /dev/rbd0
1 rbd_hdd vm-107-disk-1 - /dev/rbd1
10 rbd_hdd vm-144-disk-1 - /dev/rbd10
11 rbd_hdd vm-145-disk-1 - /dev/rbd11
12 rbd_hdd vm-151-disk-1 - /dev/rbd12
13 rbd_hdd vm-154-disk-1 - /dev/rbd13
14 rbd_hdd vm-170-disk-1 - /dev/rbd14
15 rbd_hdd vm-204-disk-1 - /dev/rbd15
16 rbd_nvme vm-100-disk-1 - /dev/rbd16
17 rbd_hdd vm-171-disk-1 - /dev/rbd17
18 rbd_hdd vm-291-disk-1 - /dev/rbd18
19 rbd_hdd vm-285-disk-1 - /dev/rbd19
2 rbd_nvme vm-101-disk-1 - /dev/rbd2
20 rbd_nvme vm-212-disk-1 - /dev/rbd20
21 rbd_nvme vm-211-disk-1 - /dev/rbd21
22 rbd_nvme vm-212-disk-2 - /dev/rbd22
23 rbd_nvme vm-211-disk-2 - /dev/rbd23
24 rbd_nvme vm-212-disk-3 - /dev/rbd24
25 rbd_nvme vm-211-disk-3 - /dev/rbd25
26 rbd_hdd vm-300-disk-1 - /dev/rbd26
3 rbd_hdd vm-108-disk-1 - /dev/rbd3
4 rbd_hdd vm-104-disk-1 - /dev/rbd4
5 rbd_hdd vm-109-disk-1 - /dev/rbd5
6 rbd_hdd vm-123-disk-1 - /dev/rbd6
7 rbd_hdd vm-138-disk-1 - /dev/rbd7
8 rbd_hdd vm-141-disk-1 - /dev/rbd8
9 rbd_hdd vm-143-disk-1 - /dev/rbd9

Replies are listed 'Best First'.
Re: Remove all lines, except those starting with pattern
by choroba (Cardinal) on Aug 09, 2018 at 12:11 UTC
    Two problems:
    1. The parentheses around (?!...) aren't optional.
    2. Look-around assertions are zero-width, i.e. if they match, they don't consume any part of the string. It means replacing them doesn't remove anything.
    To remove the first line, you'd be better off with
    perl -ne 'print if 1 != $.'

    $. is the current line number of the last accessed file handle.

Re: Remove all lines, except those starting with pattern
by Eily (Monsignor) on Aug 09, 2018 at 12:19 UTC

    The parentheses aren't optional. ^? means ^ (beginning of the string) 0 or 1 time; this will match if you are either at the beginning of the string, or if you are not... meaning everywhere. So your pattern actually is /!\d+\s+.*\n/ which is clearly not what you wanted.

    While it's possible to do what you want with the -p option (read and print each line, with possible modification), the -n option probably fits your problem better (read, but only print when you explicitly ask for it). Your one-liner becomes: perl -ne "print unless /^\d+\s+.*\n/"

    Edit: or print if /regex/ if you want to keep only the lines with the pattern. I misread the title (didn't realise that "remove" + "except" works like a double negation, so print only the lines with the pattern). Thanks johngg

Re: Remove all lines, except those starting with pattern
by tybalt89 (Monsignor) on Aug 09, 2018 at 14:30 UTC

    Close, just missing parens.

    s/^(?!\d+\s+).*\n//
Re: Remove all lines, except those starting with pattern
by kcott (Archbishop) on Aug 10, 2018 at 07:44 UTC

    G'day bbs2web,

    There may be more to your data than you show; however, from what's in your OP, you may be overthinking the solution.

    A cut-down version of your data:

    $ cat fred
    id pool image snap device
    0 rbd_hdd vm-209-disk-1 - /dev/rbd0
    1 rbd_hdd vm-107-disk-1 - /dev/rbd1
    10 rbd_hdd vm-144-disk-1 - /dev/rbd10

    Removing ... except:

    $ cat fred | perl -ne 'print unless /^[^\d]/'
    0 rbd_hdd vm-209-disk-1 - /dev/rbd0
    1 rbd_hdd vm-107-disk-1 - /dev/rbd1
    10 rbd_hdd vm-144-disk-1 - /dev/rbd10

    Which is basically a double-negative for:

    $ cat fred | perl -ne 'print if /^\d/'
    0 rbd_hdd vm-209-disk-1 - /dev/rbd0
    1 rbd_hdd vm-107-disk-1 - /dev/rbd1
    10 rbd_hdd vm-144-disk-1 - /dev/rbd10

    — Ken

Re: Remove all lines, except those starting with pattern
by perlygapes (Beadle) on Aug 12, 2018 at 10:54 UTC
    Hi bbs2web,

    Think of positive/negative look-ahead/look-behind regular expression patterns like this:

    Positive Lookahead:
    (PATTERN_OF_INTEREST)(?=FOLLOWING_PATTERN)
    Capture PATTERN_OF_INTEREST when it precedes the FOLLOWING_PATTERN

    e.g.    'this(?=\ that)' # match 'this' when followed by ' that'

    Negative Lookahead:
    (PATTERN_OF_INTEREST)(?!FOLLOWING_PATTERN)
    Capture PATTERN_OF_INTEREST when it DOES NOT precede the FOLLOWING_PATTERN

    e.g.    'this(?!\ that)' # match 'this' when NOT followed by ' that'

    Positive Look-behind:
    (?<=PRECEDING_PATTERN)(PATTERN_OF_INTEREST)
    Capture PATTERN_OF_INTEREST when it follows the PRECEDING_PATTERN

    e.g.    '(?<=this\ )that' # match 'that' when preceded by 'this '

    Negative Look-behind:
    (?<!PRECEDING_PATTERN)(PATTERN_OF_INTEREST)
    Capture PATTERN_OF_INTEREST when it DOES NOT follow the PRECEDING_PATTERN

    e.g.    '(?<!this\ )that' # match 'that' when NOT preceded by 'this '

    Both the PRECEDING_PATTERN and the FOLLOWING_PATTERN are matched as the 'look-around' patterns, but neither is captured to a variable; only what is matched as part of your (PATTERN_OF_INTEREST) capture group is captured (i.e. if you use capturing parentheses).

    Edit: replaced 'followed' with 'preceded' as correctly pointed out by AnomalousMonk. Thanks for spotting that :) It was late...

      Note that in the comments of the code examples in the Positive and Negative Look-behind sections, e.g.
          '(?<=this\ )that' # match 'that' when followed by 'this '
          '(?<!this\ )that' # match 'that' when NOT followed by 'this '
      the word "followed" should be "preceded" in both cases, i.e.
          '(?<=this\ )that' # match 'that' when preceded by 'this '
          '(?<!this\ )that' # match 'that' when NOT preceded by 'this '
      respectively.

      Both the PRECEDING_PATTERN and the FOLLOWING_PATTERN are matched as the 'look-around' patterns, but neither are captured to a variable, only what is matched as part of your (PATTERN_OF_INTEREST) capture group ...

      As an interesting (I hope) side note, a capture group embedded within a positive lookaround assertion will capture something:

      c:\@Work\Perl\monks>perl -wMstrict -le
      "my $s = 'foobar';
       ;;
       print qq{positive look-behind capture: '$1'} if $s =~ m{ (?<= (foo)) bar }xms;
       print qq{positive look-ahead capture: '$1'} if $s =~ m{ foo (?= (bar)) }xms;
      "
      positive look-behind capture: 'foo'
      positive look-ahead capture: 'bar'

      An embedded positive look-ahead capture is a way to capture and extract certain overlapping matches:

      c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my $s = 'abcd efgh';
       ;;
       my @caps = $s =~ m{ \w+ }xmsg;
       dd 'non-overlapping captures: ', \@caps;
       ;;
       @caps = $s =~ m{ (?= (\w+)) }xmsg;
       dd 'overlapping captures: ', \@caps;
      "
      ("non-overlapping captures: ", ["abcd", "efgh"])
      (
        "overlapping captures: ",
        ["abcd", "bcd", "cd", "d", "efgh", "fgh", "gh", "h"],
      )

      Update: Slight, essentially trivial wording and emphasis changes.



        ...interesting indeed, a double negative :)...switching off the anti-capture could prove useful at some point...(but I must always remember I will forget what I did and why I did it, so I have to keep it simple for my dumb future self :P)