jxh has asked for the wisdom of the Perl Monks concerning the following question:

Hello. The following gives me three hits. However, how do I 'smartly' negate this pattern, i.e. hit on everything except what I'm currently getting (without using code logic to achieve this)? I'm getting stuck specifying 'not' - OK with a character class (i.e. ^some chars) - but not when using parentheses to define multiple 'words'. I assume I'll be changing the 'or' to 'and' within the word list, but I can't seem to crack it.
#!/usr/bin/perl -w
use strict;

while ( <DATA> ) {
    s/(.*ing)(bob|fred|bill)(more)/\1HIT\3/;
    print $_ . "\n";
}

__DATA__
somestringbobmoremore
somestringfredmoremore
somestringbillmoremore
somestringtedmore

Replies are listed 'Best First'.
Re: Negating a regexp
by ikegami (Patriarch) on Jan 31, 2006 at 22:27 UTC

    /(?:(?!$re).)*/
    is the equivalent of
    /[^$chars]*/
    but it can (negatively) match entire regexps instead of a choice of characters. Keep in mind that both expressions can successfully match 0 characters if not properly anchored.

    s/(.*ing)(?:(?!bob|fred|bill).)*(more)/\1HIT\2/;
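A minimal sketch of the construct described above, for illustration only. The helper name `tempered_prefix` and the sample strings are made up here and are not from the thread. It shows the pattern consuming characters until the next position would start one of the forbidden words, and matching zero characters when a forbidden word sits right at the starting position.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the longest prefix of $str in which no
# position starts a match of $re. \A anchors the match at the start.
sub tempered_prefix {
    my ($str, $re) = @_;
    my ($prefix) = $str =~ /\A((?:(?!$re).)*)/s;
    return $prefix;
}

my $re = qr/bob|fred|bill/;

for my $str (qw( tedbobmore bobmore tedmore )) {
    # 'bobmore' yields an empty prefix: the zero-length match mentioned above.
    printf "%-12s => '%s'\n", $str, tempered_prefix($str, $re);
}
```

The empty result for `bobmore` is still a successful match, which is why the construct needs something after it (here, the literal more in the substitution) to be useful.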

    Update:

    You may want
    "something that isn't /bob|fred|bill/"
    rather than
    "something that doesn't contain /bob|fred|bill/".

    The code for that would be:

    s/ (.*ing) (?: .{0,2} | (?!bob).{3} | (?!fred|bill).{4} | .{5,} ) (more) /\1HIT\2/x;

    or

    my %bad_words = map { $_ => 1 } qw( bob fred bill );
    s/ (.*ing) (.*) (?(?{ $bad_words{$2} })\A(?!\A)) (more) /\1HIT\3/x;

    You could simplify the above to

    s/(.*ing)(?!bob|fred|bill).*(more)/\1HIT\2/;

    but only if the start of more can't match the end of bob, fred or bill.
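To make the difference between "doesn't contain" and "isn't" concrete, here is a small sketch that runs the two substitutions from this reply side by side. The sub names and the input `somestringbobbymore` are assumptions added for illustration; the regexes are the ones given above, with `${1}` written in place of `\1`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# "Doesn't contain bob, fred or bill" (the tempered pattern).
sub hit_not_containing {
    my ($str) = @_;
    $str =~ s/(.*ing)(?:(?!bob|fred|bill).)*(more)/${1}HIT${2}/;
    return $str;
}

# "Isn't exactly bob, fred or bill" (the length-based alternation).
sub hit_not_equal {
    my ($str) = @_;
    $str =~ s/ (.*ing)
               (?: .{0,2} | (?!bob).{3} | (?!fred|bill).{4} | .{5,} )
               (more) /${1}HIT${2}/x;
    return $str;
}

for my $str (qw( somestringbobmore somestringbobbymore somestringtedmore )) {
    printf "%-20s contain: %-20s equal: %s\n",
        $str, hit_not_containing($str), hit_not_equal($str);
}
```

Only `somestringbobbymore` separates the two: bobby contains bob, so the first version leaves the line alone, but bobby is not bob, so the second version rewrites it to `somestringHITmore`.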

Re: Negating a regexp
by GrandFather (Saint) on Jan 31, 2006 at 22:28 UTC

    It's negative look ahead assertion time again:

    #!/usr/bin/perl -w
    use strict;

    while (<DATA>) {
        s/(.*ing)((?!bob|fred|bill)\w+?)(more)/$1HIT$3/;
        print $_;
    }

    __DATA__
    somestringbobmoremore
    somestringfredmoremore
    somestringbillmoremore
    somestringtedmore

    Prints:

    somestringbobmoremore
    somestringfredmoremore
    somestringbillmoremore
    somestringHITmore

    DWIM is Perl's answer to Gödel
      Not quite. I have a feeling more is only a stand-in for the real trailing text. If so, this may fail. For example, if more is really less,
      somestringbilless
      doesn't get changed to
      somestringHITless.

        Modified regex to accommodate bil:

        #!/usr/bin/perl -w
        use strict;

        while (<DATA>) {
            s/(.*ing)((?!bob|fred|bill(?!ess))\w+?)(more|less)/$1HIT$3/;
            print $_;
        }

        __DATA__
        somestringbobmoremore
        somestringfredmoremore
        somestringbillmoremore
        somestringtedmore
        somestringbilless

        Prints:

        somestringbobmoremore
        somestringfredmoremore
        somestringbillmoremore
        somestringHITmore
        somestringHITless
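The `bill(?!ess)` patch handles one specific overlap by hand. One possible generalization, offered as a sketch and not something proposed in the thread, is to reject a position only when a whole bad word is followed directly by one of the tail words. The variable names `$bad` and `$tail` and the sub `mark_hit` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Word lists taken from the thread; the variable names are assumptions.
my $bad  = qr/bob|fred|bill/;
my $tail = qr/more|less/;

sub mark_hit {
    my ($str) = @_;
    # Fail only where a complete bad word is immediately followed by a tail
    # word; "bil" + "less" passes because "bill" is followed by "ess".
    $str =~ s/(.*ing)(?!$bad$tail)\w+?($tail)/${1}HIT${2}/;
    return $str;
}

for my $str (qw(
    somestringbobmoremore
    somestringfredmoremore
    somestringbillmoremore
    somestringtedmore
    somestringbilless
)) {
    print mark_hit($str), "\n";
}
```

On these five inputs it gives the same lines as the output above. It differs from the hand-patched regex on inputs such as `somestringbillfoomore`, which this version treats as a hit because billfoo is not one of the bad words.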


        It is just what OP asked for. If OP wanted to match more or less then:

        s/(.*ing)((?!bob|fred|bill)\w+?)(more|less)/$1HIT$3/

        does the trick.

        Note that OP says:

        I'm getting stuck specifying 'not' - OK with a character class (i.e. ^some chars) - but not when using parentheses to define multiple 'words'
