Here's a solution that exactly matches the phrases specified in AnonyMonk's Re: Using Look-ahead and Look-behind post (which the code of Re^2: Using Look-ahead and Look-behind does not quite do). It also shows how to use the newfangled backtracking control verbs of Perl 5.10, here (*SKIP)(*FAIL), to emulate variable-width negative look-behind. Variable-width positive look-behind can likewise be emulated with 5.10's \K assertion, although this solution does not need it.
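As a side note on \K, here is a minimal sketch of how it can stand in for a variable-width positive look-behind. The helper name mark_private_equity and the sample string are made up for illustration and are not part of the original thread.

```perl
use 5.010;    # \K needs Perl 5.10 or later
use strict;
use warnings;

# Hypothetical helper: upper-case 'equity' only where it follows
# 'private' plus any amount of whitespace.
sub mark_private_equity {
    my ($string) = @_;
    # \K drops 'private \s+' from the reported match, so only 'equity'
    # is replaced. A true look-behind could not use the variable-width \s+.
    (my $copy = $string) =~ s{ private \s+ \K equity }{EQUITY}xmsg;
    return $copy;
}

print mark_private_equity('equity, private   equity'), "\n";
```

Only the second 'equity' should be changed, and the run of spaces before it is left untouched.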
Explanation:
- Any 'equity' that is preceded by
  - either a character that is neither a comma nor whitespace, or
  - the 'private' phrase
  FAILS and is skipped over (this test takes precedence);
- Otherwise, any 'equity' that is not followed by a comma that is itself immediately followed by a non-whitespace character SUCCEEDS.
>perl -wMstrict -le
"use Test::More 'no_plan';
;;
for my $ar_vector (
    [ YES => 'equity, private equity', ],
    [ YES => 'equity', ],
    [ no => 'private equity', ],
    [ YES => 'private equity,equity', ],
    [ YES => 'private equity, equity', ],
    [ no => 'equity,private equity', ],
    [ no => 'private equity', ],
    [ no => 'mutual funds', ],
    [ no => 'cds' ],
) {
    my ($expected, $string) = @$ar_vector;
    is match($string), $expected, qq{'$string'};
}
;;
sub match {
    my ($string) = @_;
    ;;
    my $char_not_comma_or_space = qr{ [^,\s] }xms;
    my $private = qr{ private \s+ }xms;
    # first alternative rejects and skips a disqualified equity,
    # second alternative accepts a qualifying one
    return 'YES' if $string =~
        m{ (?: $char_not_comma_or_space | $private) equity (*SKIP)(*FAIL)
           |
           equity (?! , \S)
         }xms;
    return 'no';
}
"
ok 1 - 'equity, private equity'
ok 2 - 'equity'
ok 3 - 'private equity'
ok 4 - 'private equity,equity'
ok 5 - 'private equity, equity'
ok 6 - 'equity,private equity'
ok 7 - 'private equity'
ok 8 - 'mutual funds'
ok 9 - 'cds'
1..9
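To see which 'equity' the pattern actually settles on, the same regex can be made to report the offset of the accepted match. This is only a sketch: match_offset is a hypothetical variant of match() above, and the two strings are taken from the test vectors.

```perl
use 5.010;    # (*SKIP) and (*FAIL) need Perl 5.10 or later
use strict;
use warnings;

# Hypothetical variant of match(): return the offset of the accepted
# 'equity', or undef when no 'equity' qualifies.
sub match_offset {
    my ($string) = @_;
    my $char_not_comma_or_space = qr{ [^,\s] }xms;
    my $private                 = qr{ private \s+ }xms;
    # $-[0] is the start offset of the last successful match.
    return $-[0] if $string =~
        m{ (?: $char_not_comma_or_space | $private) equity (*SKIP)(*FAIL)
           |
           equity (?! , \S)
         }xms;
    return;
}

for my $string ('private equity, equity', 'equity,private equity') {
    my $offset = match_offset($string);
    printf "%-26s %s\n", qq{'$string'},
        defined $offset ? "matched at offset $offset" : 'no match';
}
```

For 'private equity, equity' the first 'equity' is consumed by the (*SKIP)(*FAIL) branch, so the match lands on the second one at offset 16. For 'equity,private equity' the leading 'equity' is vetoed by the look-ahead and the trailing one by the 'private' branch, so nothing matches.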