Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to build a regex that will match
"cat" but not match "housecat".

I know how to exclude characters with [^], but how do you
exclude strings?

This does not work: /[^(house)]cat/

Thanks..

Re: excluding strings in regex
by Ovid (Cardinal) on Apr 09, 2004 at 22:57 UTC

    You can use a zero-width negative lookbehind assertion:

    perl -le 'foreach (qw/housecat fatcat/) {print "$_ matches" if /(?<!house)cat/}'
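To see why the character-class attempt in the question fails while the lookbehind works, here is a minimal sketch. The `matching` helper and the word list are made up for illustration and are not from the original posts.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the words that match $re.
sub matching {
    my ($re, @words) = @_;
    return grep { $_ =~ $re } @words;
}

my @words = qw(cat housecat fatcat);

# [^(house)] is a character class: exactly one character that is not
# '(', 'h', 'o', 'u', 's', 'e' or ')'. It also requires that such a
# character be present, so a bare "cat" cannot match.
my @by_class = matching(qr/[^(house)]cat/, @words);

# (?<!house) consumes nothing; it only checks that "house" does not
# immediately precede "cat".
my @by_lookbehind = matching(qr/(?<!house)cat/, @words);

print "character class: @by_class\n";
print "lookbehind:      @by_lookbehind\n";
```

The character class rejects "housecat" only by accident (the "e" before "cat" happens to be in the excluded set), and it also rejects a plain "cat" because it needs one character in front.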

    Cheers,
    Ovid


Re: excluding strings in regex
by Anomynous Monk (Scribe) on Apr 09, 2004 at 23:45 UTC
    In general, if you want to match a word, use the word boundary assertions: /\bcat\b/.

    If you specifically want to exclude "housecat" and "polecat" but match "Yucatan" or anything else that contains "cat", use negative lookbehind assertions: /(?<!pole)(?<!house)cat/. (The regex that goes in (?<! ) or (?<= ) must be of fixed width, so "pole" and "house" need separate assertions; however, "wild" has the same length as "pole", so the two could be combined as /(?<!wild|pole)cat/.)
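The three patterns mentioned above can be compared side by side with a small sketch. The word list and labels are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Sample inputs (made up for this sketch).
my @words = ('cat', 'housecat', 'polecat', 'wildcat', 'Yucatan', 'the cat sat');

# Label => pattern pairs, kept in an array so the output order is stable.
my @patterns = (
    [ 'whole word'          => qr/\bcat\b/ ],
    [ 'not pole, not house' => qr/(?<!pole)(?<!house)cat/ ],
    [ 'not wild, not pole'  => qr/(?<!wild|pole)cat/ ],
);

my %hits;
for my $p (@patterns) {
    my ($label, $re) = @$p;
    # Collect every sample string that the pattern matches.
    $hits{$label} = [ grep { $_ =~ $re } @words ];
    print "$label: ", join(', ', @{ $hits{$label} }), "\n";
}
```

Note that the word-boundary version rejects "Yucatan" and "wildcat" as well, while the lookbehind versions only reject the prefixes they name.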

Re: excluding strings in regex
by water (Deacon) on Apr 10, 2004 at 02:17 UTC
    The original post doesn't really say why "cat" is OK and "housecat" isn't (could be word boundary issues, length issues, hair on the sofa issues, whatever), so it is hard to give a precise answer. That said, I just wanted to point out that if your situation allows multiple regexps, things sometimes get simpler:
    if (/cat/ and ! /housecat/) {...
    # or
    if (/cat\b/ and ! /housecat\b/) {...
    # etc etc
    Just stating the obvious: there is often no need to cram everything into one regexp when using two makes things simpler.
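A runnable sketch of the two-regexp idea follows. The function name `has_cat_but_no_housecat` and the test strings are hypothetical. It also shows one case where the two-regexp test and a single lookbehind regexp disagree, so the choice depends on what you actually want to reject.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text contains "cat" and does not contain
# "housecat" anywhere.
sub has_cat_but_no_housecat {
    my ($text) = @_;
    return ($text =~ /cat/ and $text !~ /housecat/) ? 1 : 0;
}

for my $text ('cat', 'housecat', 'a housecat and a cat') {
    my $two_regexps = has_cat_but_no_housecat($text);
    # Single-regexp alternative: some "cat" not preceded by "house".
    my $lookbehind  = ($text =~ /(?<!house)cat/) ? 1 : 0;
    print "'$text': two regexps = $two_regexps, lookbehind = $lookbehind\n";
}
```

For the last string the two approaches differ: the two-regexp test rejects the whole string because "housecat" occurs somewhere in it, while the lookbehind accepts it because the second "cat" is not preceded by "house".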

    water