Here's a solution that exactly matches the phrases specified in AnonyMonk's Re: Using Look-ahead and Look-behind post (which the code of Re^2: Using Look-ahead and Look-behind does not quite do). It also shows how to use the newfangled backtracking control verbs of Perl 5.10, here (*SKIP) and (*FAIL), to emulate variable-width negative look-behind. Variable-width positive look-behind can be emulated with 5.10's \K assertion.
Explanation:
- Any 'equity' that is preceded by
  - either a character that is not a comma or whitespace, or
  - the 'private' phrase ('private' followed by whitespace)
  FAILS and is skipped over (this test has first precedence);
- Otherwise, any 'equity' that is not followed by a comma that is then followed by any non-whitespace SUCCEEDS.
>perl -wMstrict -le
"use Test::More 'no_plan';
 ;;
 for my $ar_vector (
   [ YES => 'equity, private equity', ],
   [ YES => 'equity', ],
   [ no  => 'private equity', ],
   [ YES => 'private equity,equity', ],
   [ YES => 'private equity, equity', ],
   [ no  => 'equity,private equity', ],
   [ no  => 'private equity', ],
   [ no  => 'mutual funds', ],
   [ no  => 'cds' ],
   ) {
   my ($expected, $string) = @$ar_vector;
   is match($string), $expected, qq{'$string'};
   }
 ;;
 sub match {
   my ($string) = @_;
   ;;
   my $char_not_comma_or_space = qr{ [^,\s] }xms;
   my $private = qr{ private \s+ }xms;
   return 'YES' if $string =~
     m{ (?: $char_not_comma_or_space | $private) equity (*SKIP)(*FAIL)
        |
        equity (?! , \S)
      }xms;
   return 'no';
   }
"
ok 1 - 'equity, private equity'
ok 2 - 'equity'
ok 3 - 'private equity'
ok 4 - 'private equity,equity'
ok 5 - 'private equity, equity'
ok 6 - 'equity,private equity'
ok 7 - 'private equity'
ok 8 - 'mutual funds'
ok 9 - 'cds'
1..9
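The code above only exercises the (*SKIP)(*FAIL) half. For the \K half, here is a minimal sketch. The sub name mark_private_equity and the sample string are my own inventions for illustration, not anything from the thread. It upper-cases only those 'equity' occurrences that follow 'private' and any amount of whitespace, which is what a look-behind like (?<= private \s+ ) would express if unbounded-width look-behind were accepted.

```perl
use 5.010;          # \K needs perl 5.10 or later
use strict;
use warnings;

# Hypothetical helper (name and behaviour are illustrative only):
# upper-case every 'equity' that directly follows 'private' plus
# one or more whitespace characters.
sub mark_private_equity {
    my ($string) = @_;

    # Everything matched before \K is required, but is left out of
    # the match proper, so only 'equity' itself gets replaced.
    (my $copy = $string) =~ s{ private \s+ \K equity }{EQUITY}xmsg;

    return $copy;
}

say mark_private_equity('private   equity, public equity');
# prints: private   EQUITY, public equity
```

Note that, unlike a true look-behind, the text before \K is still consumed by the match, so results can differ when successive matches would need to overlap.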