jarthda has asked for the wisdom of the Perl Monks concerning the following question:

Is there a metacharacter to NOT match a specific character? I have two strings: dont, don. I want to match don but not dont. I can do that with /don\s/, but is there a way to do it with a "not match" thingy, like /don!t/ (imagine ! means "match anything but t")?

Replies are listed 'Best First'.
Re: reg ex NOT
by ikegami (Patriarch) on Jun 15, 2006 at 21:48 UTC

    [^t] matches any character but t. It can be modified using ?, + and *, as usual.
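A minimal sketch of the negated class with and without a quantifier. The sample words and the two helper subs are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: true when "don" is followed by exactly one
# character that is not "t" (the class must consume a character).
sub has_non_t_after_don {
    my ($word) = @_;
    return $word =~ /don[^t]/ ? 1 : 0;
}

# Hypothetical helper: true when the string starts with "don" and the
# rest of it (possibly empty) contains no "t" at all.
sub tail_is_t_free {
    my ($word) = @_;
    return $word =~ /^don[^t]*\z/ ? 1 : 0;
}

for my $word ("don", "dont", "done", "don.") {
    printf "%-5s don[^t]: %d   ^don[^t]*\\z: %d\n",
        $word, has_non_t_after_don($word), tail_is_t_free($word);
}
```

Note that plain [^t] needs a character to be present, so "don" on its own fails the first test, while the starred and anchored form accepts it.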

    You might also be interested in (?:(?!regexp).)*, which matches a sequence of characters that does not contain anything that matches a specified regexp.

      Remember that * means match zero or more times and ? means match zero or one times, so either can always match zero times. When you match zero times, you don't prevent anything from matching:

      print "Matched" if "dont" =~ m/don(?:(?!t).)*/;

      You could anchor it so that nothing after it could match:

      print "Matched" if "dont" =~ m/don(?:(?!t).)*$/;
      It's easier just to type the negated character class for that position, though:
      print "Matched" if "dont" =~ m/don[^t]/;
      --
      brian d foy <brian@stonehenge.com>
      Subscribe to The Perl Review

        Aye, the construct I posted needs to be anchored (but not necessarily using ^ and $). It may be a waste to use that construct for single characters, but it's useful (necessary) for the use I described in my original post (i.e. to match a sequence of characters that does not contain anything that matches a specified regexp). For example,

        /<table>(?:(?!<\/table>).)*<\/table>/

        (Not the best example, because tables can be nested in HTML, among other reasons, but you get the idea.)
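To see what the construct buys you over a plain .*, here is a rough sketch. The $html string is a hypothetical input with two flat (non-nested) tables on one line; it is not from the thread.

```perl
use strict;
use warnings;

# Hypothetical input: two flat tables on a single line.
my $html = '<table>a</table> text <table>b</table>';

# Greedy .* runs from the first <table> to the LAST </table>.
my ($greedy) = $html =~ m{(<table>.*</table>)};

# (?:(?!</table>).)* consumes one character at a time, but only while
# the closing tag does not start at the current position, so each
# match stops at the nearest </table>.
my @tables = $html =~ m{(<table>(?:(?!</table>).)*</table>)}g;

print "greedy:   $greedy\n";
print "tempered: $_\n" for @tables;
```

If the tables can span several lines, add the /s modifier so that . also matches a newline.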

Re: reg ex NOT
by graff (Chancellor) on Jun 16, 2006 at 00:42 UTC
    In addition to using the "complement character class" notation in the first suggestion above, there is also a thing called a "zero-width negative look-ahead assertion". (A bunch of "zero-width assertions", as well as character classes and everything else, are described in helpful detail in the perlre man page.)

    Note the following subtle difference between using a character-class vs. using a zero-width assertion:

    #!/usr/bin/perl
    use strict;

    my $string1 = "don";    # should both of these match?
    my $string2 = "donk";   # (you be the judge, and choose
                            #  your regex accordingly)
    my %regex = (
        complem_char_class => qr/don[^t]/,
        zwid_neg_lookahead => qr/don(?!t)/,
    );

    for my $regtyp ( sort keys %regex ) {
        print "\n";
        for ( $string1, $string2 ) {
            my $result = ( /$regex{$regtyp}/ ) ? "succeeds" : "fails";
            print "For $_ : match $result based on $regtyp\n";
        }
    }

    The output of that little snippet shows that the character-class regex has to match something (that is, there has to be a character in that position, and the character can be anything other than "t"), so the string "don" (with nothing after "n") won't match.

    The "zero-width" operators allow you to state some condition that needs to be satisfied at a given position in the string, whether or not there happens to be a character present at that position.
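Another way to see the "zero-width" part is to look at the text each regex actually consumed. This is only a sketch: matched_text is a hypothetical helper, not something from perlre, and it relies on the built-in @- and @+ arrays (start and end offsets of the last successful match).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text the regex matched,
# or undef when the regex does not match at all.
sub matched_text {
    my ($string, $regex) = @_;
    return $string =~ $regex
        ? substr($string, $-[0], $+[0] - $-[0])
        : undef;
}

for my $string ("don", "donk", "dont") {
    my $class = matched_text($string, qr/don[^t]/);
    my $ahead = matched_text($string, qr/don(?!t)/);
    printf "%-5s class: %-7s lookahead: %s\n",
        $string,
        defined $class ? $class : "(none)",
        defined $ahead ? $ahead : "(none)";
}
```

For "donk" the character class consumes the "k" (the match is "donk"), while the look-ahead only inspects it (the match is "don"). That matters if the rest of your regex still needs to match that character.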