Re: Regex negative word
by GrandFather (Saint) on Mar 30, 2006 at 03:34 UTC
|
$_ = "cat dog";
print 'match' if /^(?!cat)/;
print 'no match' unless /^(?!cat)/;
Prints:
no match
DWIM is Perl's answer to Gödel
Re: Regex negative word
by McDarren (Abbot) on Mar 30, 2006 at 03:40 UTC
|
Of course, there are several ways to "skin a cat" (sorry :D)
Update: just acknowledging the fact that (as has been pointed out) this solution doesn't actually work. I won't bother changing the code, but rather leave it there as an example of how _not_ to do it ;)
#!/usr/bin/perl -wl
use strict;
while (<DATA>) {
chomp;
print "$_ matched" if /^[^(?:cat)]/;
}
__DATA__
cat dog
dog cat
frog cat dog
mouse dog cat
cat dog elephant
Prints:
dog cat matched
frog cat dog matched
mouse dog cat matched
Cheers,
Darren :)
|
|
use strict;
use warnings;
while (<DATA>) {
print "matched $_" if /^[^cat)(?:]/;
}
__DATA__
cat dog
dog cat
aardvark ant
catnip dogbone
frog cat dog
mouse dog cat
cat dog elephant
can of worms
Prints:
matched dog cat
matched frog cat dog
matched mouse dog cat
The following code excludes lines that begin with the word cat and insists that the line ends with the word dog:
use strict;
use warnings;
while (<DATA>) {
print "matched $_" if /^(?!cat\b).*(?=\bdog$)/;
}
__DATA__
cat dog
dog cat
catnip dogbone
aardvark ant
frog cat dog
can of worms
Prints:
matched frog cat dog
DWIM is Perl's answer to Gödel
|
|
McDarren,
your code fails because there is no such thing as complex grouping inside character classes (apart from ranges denoted by -). In particular, your (?:cat) in there is not a single entity saying "c followed by a followed by t"; it simply contributes the 7 characters it consists of to the set.
GrandFather saw this, but obscured the point somewhat in his reply by reordering those 7 characters. You can run your code against his __DATA__ and still see that lines beginning with "a" or with any "c" (not only those beginning with "cat") yield no match!
The way to solve the OP's problem is to use look-ahead assertions, as demonstrated by GrandFather in both his replies.
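To make the difference concrete, here is a minimal sketch that runs the bracketed pattern and the look-ahead pattern side by side. The helper names class_match and lookahead_match are made up for this illustration and do not appear anywhere in the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; the names are not from the thread.
# The bracketed pattern is a negated character class: it rejects any string
# whose first character is one of ( ? : c a t ) and also the empty string.
sub class_match     { return $_[0] =~ /^[^(?:cat)]/ ? 1 : 0 }

# The look-ahead pattern rejects only strings that start with "cat".
sub lookahead_match { return $_[0] =~ /^(?!cat)/ ? 1 : 0 }

for my $line ('cat dog', 'can of worms', 'aardvark ant', 'dog cat') {
    printf "%-14s class=%d lookahead=%d\n",
        $line, class_match($line), lookahead_match($line);
}
```

"can of worms" and "aardvark ant" are the interesting rows: the character class rejects both because of their first letter, while the look-ahead accepts them.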
Re: Regex negative word
by brian_d_foy (Abbot) on Mar 30, 2006 at 10:38 UTC
|
If you want to find strings that don't match a pattern, you can use the negating binding operator !~. It's nicer looking than complex patterns.
if( $string !~ m/^cat/ ) { ... }
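As a small runnable sketch of the same idea, the snippet below filters a list with !~. The helper name not_cat_first and the sample lines are assumptions chosen for illustration (the lines are borrowed from the test data posted earlier in the thread).

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative only):
# true when the string does not start with "cat".
sub not_cat_first {
    my ($string) = @_;
    return $string !~ m/^cat/ ? 1 : 0;
}

my @lines = ('cat dog', 'dog cat', 'frog cat dog');

# Keep only the lines that do not begin with "cat".
print "$_\n" for grep { not_cat_first($_) } @lines;
```

This prints "dog cat" and "frog cat dog". A second condition, such as $string =~ /dog$/, can be combined with && when the end of the string matters too.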
Re: Regex negative word
by zer (Deacon) on Mar 30, 2006 at 03:40 UTC
|
What if you are trying to say: it doesn't begin with cat but it does end in dog? Does this work in a single statement?
|
|
# assume $_ holds the value I'm testing
# 1: Two tests
if ( /^(?!cat)/ && /dog$/ ) {} # if does not start with cat && ends with dog
# 2: take advantage of zero-width assertion in (?!cat)
if ( /^(?!cat).*dog$/ ) {} # if we start with something that is not 'cat'
                           # then have 0 or more chars before ending with 'dog'
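# 3: not using regex at all!
if (
index($_, 'cat') != 0
&& length($_) >= 3   # guard: for short strings, length - 3 could equal rindex's -1
&& rindex($_, 'dog') == length($_) - 3
) {} # if we don't find 'cat' at the head of the string,
# && the last 'dog' sits at the very end (rindex, so an earlier 'dog' can't mislead us).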
Re: Regex negative word
by GhodMode (Pilgrim) on Mar 30, 2006 at 11:54 UTC
|
/^(?!cat)\s+dog/
This is called "A zero-width negative lookahead assertion." ... See why so many people like Perl for obfuscation :)
Ref: perl.com
--
-- GhodMode
Blessed is he who has found his work; let him ask no other blessedness.
-- Thomas Carlyle
|
|
Note that your example is not entirely sensible, though. Any string your expression matches must begin with whitespace, so of course it doesn't begin with cat. You have to remember that the lookahead is zero-width, so after it, you're in the same position in the string as before it. Perhaps something like
/^(?!cat).*\sdog/
as the most general case to consume the non-cat, before-the-dog portion of the string.
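A quick sketch of the difference between the two expressions. The helper names strict_form and general_form and the sample strings are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers; the names are illustrative only.
# After the zero-width (?!cat) we are still at position 0,
# so \s+ has to match the very first character of the string.
sub strict_form  { return $_[0] =~ /^(?!cat)\s+dog/  ? 1 : 0 }

# Here .* consumes the non-cat prefix before the whitespace and "dog".
sub general_form { return $_[0] =~ /^(?!cat).*\sdog/ ? 1 : 0 }

for my $line ('frog dog', ' dog', 'cat dog') {
    printf "%-10s strict=%d general=%d\n",
        "'$line'", strict_form($line), general_form($line);
}
```

Only ' dog' (with its leading space) satisfies the first form, while the second form also accepts 'frog dog'. Both reject 'cat dog'.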
Caution: Contents may have been coded under pressure.
|
|
You are absolutely right! But, rereading the question, the whitespace and the dog aren't necessary, either. zer just asked for a regex that matches a string which does not begin with cat. So, I think /^(?!cat)/ should do it. That would match any line with a beginning which is not followed by a cat, right?
--
-- GhodMode
Blessed is he who has found his work; let him ask no other blessedness.
-- Thomas Carlyle
|
|
Re: Regex negative word
by unobe (Scribe) on Apr 02, 2006 at 04:15 UTC
|
Just for fun, I decided to try using (?(cond)ptrue[|pfalse]) , a regexp feature I learned just yesterday while reading the Perl Pocket Reference. However, reading perldoc perlre, I noted something the Perl Pocket Reference didn't: this feature is "highly experimental". That's probably why it wasn't used by anyone in the first place. Since this isn't in the first place, no harm done. :-)
/^(?(?!cat).*\bdog)$/
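Here is a minimal sketch for trying the conditional pattern against a few lines. The helper name cond_match is an assumption for illustration, and the sample strings are borrowed from the test data posted earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper; the name is illustrative only.
# If the string does not start with "cat", the yes-branch requires ".*\bdog";
# otherwise the (empty) no-branch is used, and the trailing $ then
# fails for any non-empty string that starts with "cat".
sub cond_match { return $_[0] =~ /^(?(?!cat).*\bdog)$/ ? 1 : 0 }

for my $line ('cat dog', 'dog cat', 'catnip dogbone', 'frog cat dog') {
    printf "%-16s %s\n", $line, cond_match($line) ? 'matched' : 'no match';
}
```

Of these four lines only "frog cat dog" matches. Note that, unlike the (?!cat\b) pattern shown earlier in the thread, this condition has no \b after cat, so it treats any string starting with the letters "cat" as a cat.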