b4swine has asked for the wisdom of the Perl Monks concerning the following question:

I want a regex that finds the word 'cat' when it is not immediately followed (allowing for optional whitespace) by the word 'dog'. It should match 'we have catnip for sale' and 'my cat eats dogs', but not 'I sell hot cat dogs'.

The regex I could come up with was: /cat *([^d]|[^ ][^o]|[^ ].[^g])/

But for a word longer than 'dog', this would be a pain. I am looking for a better solution.

Re: search for 'cat' not followed by 'dog'
by haukex (Archbishop) on Jan 28, 2019 at 19:18 UTC
Re: search for 'cat' not followed by 'dog'
by stevieb (Canon) on Jan 28, 2019 at 19:24 UTC

    You want a "negative lookahead":

    use warnings;
    use strict;

    while (<DATA>) {
        print if /
            cat             # word cat
            (?:             # do not capture
                (?!         # negative lookahead
                    \s*     # possible whitespace
                    dog     # the word dog
                )           # end negative lookahead
            )               # end do not capture group
        /x;

        # /cat(?:(?!\s*dog))/
    }

    __DATA__
    my cat loves dogs
    catdog
    cat dog
    dog dog
    dog cat dog
    dog cat not a dog

    Output:

    my cat loves dogs
    dog cat not a dog

    Note that the lookahead above will also treat things like cat doggle as "followed by dog" and so reject them. If you *only* want to exclude the whole word "dog", you'll have to put a word boundary (\b) at the end of that word in the regex.
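
    A minimal sketch of the word-boundary variant, assuming \b after dog is the boundary meant here. The helper name cat_without_dog and the sample strings are made up for illustration and are not from the original post:

```perl
use strict;
use warnings;

# Hypothetical helper: true when some 'cat' is NOT followed by optional
# whitespace and the whole word 'dog'.
sub cat_without_dog {
    my ($text) = @_;
    # \b after 'dog' means 'doggle' or 'dogs' no longer count as 'dog'
    return $text =~ /cat(?!\s*dog\b)/ ? 1 : 0;
}

for my $line ('cat dog', 'catdog', 'cat doggle', 'my cat eats dogs') {
    printf "%-18s %s\n", $line, cat_without_dog($line) ? 'match' : 'no match';
}
```

    With the boundary in place, only a complete "dog" after "cat" blocks the match, so "cat dogs" is accepted as well. Whether that is desirable depends on what should count as the word "dog".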

    Update: AnomalousMonk pointed out below that the non-capture portion is not required here. For posterity, I'll leave it in (unless someone advises it'd be best to remove). So essentially, the second and last lines in the regex above are not needed.
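
    For reference, a sketch of the same regex with the non-capturing wrapper dropped. The variable $cat_not_dog and the helper wanted are illustrative names, and the test strings are the three examples from the original question:

```perl
use strict;
use warnings;

# The lookahead on its own, without the surrounding (?: ... ) group
my $cat_not_dog = qr/
    cat         # the word cat
    (?!         # negative lookahead
        \s*     # possible whitespace
        dog     # the word dog
    )           # end negative lookahead
/x;

# Hypothetical helper: 1 if the line contains a 'cat' not followed by 'dog'
sub wanted {
    my ($text) = @_;
    return $text =~ $cat_not_dog ? 1 : 0;
}

for my $line ('we have catnip for sale',
              'my cat eats dogs',
              'I sell hot cat dogs') {
    print "$line\n" if wanted($line);
}
```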

      # /cat(?:(?!\s*dog))/

      I don't understand why you wrap  (?!\s*dog) in a non-capturing group.


      Give a man a fish:  <%-{-{-{-<

        It's an old habit of mine to write a non-capturing group whenever I'm doing grouping. In this case, after some *quick* testing, I realize that the negative lookahead does not capture on its own. In fact, even with benchmarking, adding the non-capturing group is effectively no more efficient than leaving it out.
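
        One way to check the capturing behaviour is to count the groups after a successful match. This is only a sketch: group_count is a hypothetical helper, and it relies on $#+ holding the number of groups in the last successful match, as described in perlvar.

```perl
use strict;
use warnings;

# Hypothetical helper: number of capture groups in a successful match
# of $re against $text, or undef when there is no match.
sub group_count {
    my ($re, $text) = @_;
    if ($text =~ $re) {
        return $#+;    # last index of @+, i.e. the number of groups
    }
    return;
}

my $text = 'my cat loves dogs';

print "plain lookahead:       ", group_count(qr/cat(?!\s*dog)/,     $text), "\n";
print "non-capturing wrapper: ", group_count(qr/cat(?:(?!\s*dog))/, $text), "\n";
print "capturing wrapper:     ", group_count(qr/cat((?!\s*dog))/,   $text), "\n";
```

        Only the last pattern, which adds plain parentheses around the lookahead, creates a capture group.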