ryanUK has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I am sorry if such questions have been posted before, but I would be grateful if you could help me out on this! I want to match the word "active" unless it's preceded by "mild" or "moderate". I tried $variable = qr/(?!(mild|moderate)).?active/ but this seems to match everything, i.e. "mild active", "moderate active" and also "active". I don't quite understand why this isn't working. Please help! Ryan

Re: Conditional matching into qr variable
by rnewsham (Curate) on May 29, 2013 at 10:58 UTC

    You can use multiple negative lookbehinds to do what you want. Although I imagine fellow monks will probably have a more elegant solution.

        use strict;
        use warnings;

        for ( <DATA> ) {
            my $variable = qr/(?<!mild\s)(?<!moderate\s)active/;
            print $_ if ( m/$variable/ );
        }

        __DATA__
        foo active
        mild active
        moderate active
        active

    Outputs

        foo active
        active
      Dear Perl Monks - Thank you so much for your help!!!
Re: Conditional matching into qr variable
by BrowserUk (Patriarch) on May 29, 2013 at 10:50 UTC

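    The problem is that you have .? after the negative lookahead. The lookahead is only checked at the position where a match attempt starts, and the engine is free to start anywhere. Starting at the space before 'active' (or at the 'a' itself), the text ahead is neither 'mild' nor 'moderate', so the assertion succeeds; the optional . then consumes the space (or nothing), and 'active' matches.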

    Now, you might think that just removing the ? would force it to work by matching the space (I did initially), but the lookahead is tested at the space itself, not before the preceding word. The text from there on is ' active', which is neither 'mild' nor 'moderate', so the assertion succeeds and every instance of active that has any character in front of it will match.
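
    To see where those matches actually start, here is a small sketch that prints the text each pattern matches. The helper name matched_text is made up for this illustration; the two patterns are the one from the question and the same one with the ? removed.

```perl
use strict;
use warnings;

# Return the text a pattern matches in a string, or undef if it does not match.
# (matched_text is a hypothetical helper name used only for this illustration.)
sub matched_text {
    my ($re, $text) = @_;
    # @- and @+ hold the start and end offsets of the last successful match.
    return $text =~ $re ? substr($text, $-[0], $+[0] - $-[0]) : undef;
}

my %patterns = (
    'with .?' => qr/(?!(mild|moderate)).?active/,   # pattern from the question
    'with .'  => qr/(?!(mild|moderate)).active/,    # same pattern without the ?
);

for my $text ('mild active', 'moderate active', 'active') {
    for my $label (sort keys %patterns) {
        my $hit = matched_text($patterns{$label}, $text);
        printf "%-16s %-8s %s\n", $text, $label,
            defined $hit ? "matched '$hit'" : 'no match';
    }
}
```

    In both variants the match on 'mild active' and 'moderate active' begins at the space, which is why the lookahead never sees the unwanted word. Without the ?, a bare 'active' at the start of the string no longer matches, because the . needs a character to consume.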

    This works (though it will also (for example) not match 'elaborate active' should that appear):

    my $re = qr[(?<!mild|rate)\s+active];

    Negative lookbehinds have to be fixed length (hence the truncation of moderate).

    Update: rather than risking mismatches caused by the truncation of the longer word, you could pad the shorter word:

    my $re = qr[(?<!....mild|moderate)\s+active];

    if you think that will be less likely to cause mismatches.
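
    As a rough comparison of the two variants, the sketch below runs both patterns over a few sample strings. The sample strings and the helper name verdict are my own choices for illustration, not data from the original question.

```perl
use strict;
use warnings;

my $truncated = qr[(?<!mild|rate)\s+active];         # 'moderate' cut down to 'rate'
my $padded    = qr[(?<!....mild|moderate)\s+active]; # 'mild' padded to 8 characters

# Hypothetical helper: report whether a pattern matches a string.
sub verdict {
    my ($re, $text) = @_;
    return $text =~ $re ? 'match' : 'no match';
}

# Illustrative sample strings (assumptions, not the poster's data).
my @samples = (
    'foo active',
    'very mild active',
    'moderate active',
    'elaborate active',
    'mild active',
);

printf "%-18s %-10s %s\n", 'text', 'truncated', 'padded';
for my $text (@samples) {
    printf "%-18s %-10s %s\n", $text, verdict($truncated, $text), verdict($padded, $text);
}
```

    The truncated pattern rejects 'elaborate active', as mentioned above. The padded pattern accepts it, but it also accepts 'mild active' when 'mild' sits within four characters of the start of the string, because there are not enough characters for the four dots to consume. Note too that both patterns require whitespace before 'active', so a line consisting of just 'active' matches neither.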


Re: Conditional matching into qr variable
by choroba (Cardinal) on May 29, 2013 at 10:52 UTC
    You used (?!, which is a look-ahead. You needed a look-behind. Unfortunately, variable length look-behind is not implemented (and "moderate" is longer than "mild"). So, you have to reverse your string and use a negative variable length look-ahead:
        #!/usr/bin/perl
        use warnings;
        use strict;
        use feature qw(say);

        my @w = ('active', 'not active', 'mild active', 'moderate active');
        for (@w) {
            my $r = reverse;
            say "$_: ", $r =~ /evitca(?! dlim| etaredom)/ ? 1 : 0, '.';
        }
Re: Conditional matching into qr variable
by RichardK (Parson) on May 29, 2013 at 10:56 UTC

    If you have another look at perlre you'll see that (?!(mild)) is a look-ahead assertion, and you need a look-behind: (?<!(mild)).

    But, at least in my version of Perl (5.16), the docs say:

    "(?<!pattern)" A zero-width negative look-behind assertion. For example "/(?<!bar)foo/" matches any occurrence of "foo" that does not follow "bar". Works only for fixed-width look-behind.

    So you probably need to try a different approach. Try doing your test in two phases: (1) match lines with 'active' and capture the previous word, then (2) skip any line where that word is 'mild' or 'moderate'.
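
    A minimal sketch of that two-phase idea, assuming whole-word matching is what you want and that words are separated by whitespace. The function name has_plain_active and the sample lines are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line contains an "active" whose
# preceding word is neither "mild" nor "moderate".
sub has_plain_active {
    my ($line) = @_;

    # Phase 1: find each whole-word "active" and capture the word before it.
    # $1 is undef when "active" has no preceding word.
    while ($line =~ /(?:\b(\w+)\s+)?\bactive\b/g) {
        my $previous = defined $1 ? lc $1 : '';

        # Phase 2: skip this occurrence if the previous word is unwanted.
        next if $previous eq 'mild' || $previous eq 'moderate';
        return 1;
    }
    return 0;
}

# Sample lines (assumed for illustration).
for my $line ('foo active', 'mild active', 'moderate active', 'active') {
    print "$line\n" if has_plain_active($line);
}
```

    Because the previous word is compared as a plain string, no look-behind is involved and the two words can be of any length. The list of unwanted words could just as well live in a hash if it grows.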

    If you show a few example lines of your data then someone here may have a better suggestion.