PerlMonks  

Re^3: Using Look-ahead and Look-behind

by AnomalousMonk (Archbishop)
on Jun 25, 2011 at 19:51 UTC


in reply to Re^2: Using Look-ahead and Look-behind
in thread Using Look-ahead and Look-behind

Here's a solution that exactly matches the phrases specified in AnonyMonk's Re: Using Look-ahead and Look-behind post (which the code of Re^2: Using Look-ahead and Look-behind does not quite do), and also shows how to use the newfangled backtracking control verbs of 5.10 to emulate variable-width negative look-behind. Variable-width positive look-behind can be emulated with 5.10's \K assertion.
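For the \K half of that remark, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not from the thread. An unbounded look-behind such as (?<= private \s+ ) is rejected by the regex compiler, whereas \K lets the variable-width prefix be matched normally and then dropped from the reported match, so a substitution only touches what follows it.

```perl
use strict;
use warnings;

# Hypothetical sample text, not from the thread.
my $text = 'private    equity, public equity';

# 'private \s+' is matched as an ordinary (variable-width) prefix;
# \K then discards it from the matched text, so only 'equity' is replaced.
(my $copy = $text) =~ s{ private \s+ \K equity }{EQUITY}xms;

print "$copy\n";
```

Only the 'equity' that follows 'private' plus whitespace should be upper-cased; the second 'equity' is left alone.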

Explanation:

  • Any 'equity' that is preceded by
    • either a character that is neither a comma nor whitespace, or
    • the 'private' phrase
    FAILS and is skipped over (this test has first precedence);
  • Otherwise, any 'equity' that is not followed by a comma and then a non-whitespace character SUCCEEDS.

>perl -wMstrict -le
"use Test::More 'no_plan';
 ;;
 for my $ar_vector (
   [ YES => 'equity, private equity', ],
   [ YES => 'equity',                 ],
   [ no  => 'private equity',         ],
   [ YES => 'private equity,equity',  ],
   [ YES => 'private equity, equity', ],
   [ no  => 'equity,private equity',  ],
   [ no  => 'private equity',         ],
   [ no  => 'mutual funds',           ],
   [ no  => 'cds'                     ],
   ) {
   my ($expected, $string) = @$ar_vector;
   is match($string), $expected, qq{'$string'};
   }
 ;;
 sub match {
   my ($string) = @_;
 ;;
   my $char_not_comma_or_space = qr{ [^,\s] }xms;
   my $private = qr{ private \s+ }xms;
   return 'YES' if $string =~ m{
       (?: $char_not_comma_or_space | $private) equity (*SKIP)(*FAIL)
       |
       equity (?! , \S)
       }xms;
   return 'no',
   }
"
ok 1 - 'equity, private equity'
ok 2 - 'equity'
ok 3 - 'private equity'
ok 4 - 'private equity,equity'
ok 5 - 'private equity, equity'
ok 6 - 'equity,private equity'
ok 7 - 'private equity'
ok 8 - 'mutual funds'
ok 9 - 'cds'
1..9
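To see which 'equity' actually satisfies the pattern, the same regex can be wrapped in a small script that reports the match offset. This is only a sketch: the helper name equity_offset is hypothetical, and the two intermediate qr// objects from the one-liner are inlined here.

```perl
use strict;
use warnings;

# Same pattern as in the post above, with the qr// pieces inlined.
my $wanted_equity = qr{
    (?: [^,\s] | private \s+ ) equity (*SKIP)(*FAIL)   # unwanted: consume it, then give up here
    |
    equity (?! , \S )                                  # wanted
}xms;

# Hypothetical helper: offset of the accepted 'equity', or undef if none.
sub equity_offset {
    my ($string) = @_;
    return $string =~ $wanted_equity ? $-[0] : undef;
}

for my $string ('private equity, equity', 'equity, private equity', 'private equity') {
    my $offset = equity_offset($string);
    printf "%-26s => %s\n", "'$string'",
        defined $offset ? "matched at offset $offset" : 'no match';
}
```

When the first alternative consumes 'private equity', (*SKIP) marks that position and (*FAIL) forces a failure, so the next match attempt starts after the consumed text rather than one character further on. That is why the 'equity' inside 'private equity' is never offered to the second alternative.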

Replies are listed 'Best First'.
Re^4: Using Look-ahead and Look-behind
by JohnN (Initiate) on Oct 15, 2012 at 15:09 UTC

    I have a dumb question.

    This code works well (THANKS Roy!) when looking for DNA string matches within a genome sequence, but not when the * is changed to {50,100}.

    e.g.
    /CCGG            # Match starting at DNA sequence CCGG
     (
       (?:
         (?!CCGG)    # make sure we're not finding duplicates mid-stream
         .           # accept any character
       )*?           # any number of times BUT not greedily <====
     )
     AATT            # and ending at AATT
    /x;

    versus

    /CCGG
     (
       (?:
         (?!CCGG)
         .
       ){50,100}?    # <====
     )
     AATT            # and ending at AATT
    /x;

    This latter one does not have dupes of CCGG but does have dupes of AATT. The previous snippet has no dupes of either CCGG or AATT.

    A follow-up: The following code snippet fixes my problem, and I have NO idea why! I tried it out of desperation.

    /CCGG
     (
       (?:
         (?!AATT|CCGG)   # <=============
         .               #
       ){50,100}?        # Here the "?" is not required but I'm anal
     )                   #
     AATT                #
    /x;

      When * is changed to ^, it does not work either. Why are you changing it at all?

      But jokes aside: The *? stops matching as soon as it sees the first occurrence of AATT, so there are no dupes. The {50,100} must match at least 50 times, so if there is an AATT after, say, the 25th character, it cannot stop there and must match a longer string.
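The effect is easy to reproduce on a toy string. In this sketch the sequence is invented and {5,10} stands in for the {50,100} of the question; everything else follows the patterns quoted above.

```perl
use strict;
use warnings;

# Hypothetical toy sequence; the first AATT comes only 2 characters after CCGG.
my $dna = 'CCGGxxAATTyyyyAATT';

# Lazy star: free to stop after 0, 1, 2, ... characters, so it stops at the first AATT.
my ($lazy_star) = $dna =~ /CCGG((?:(?!CCGG).)*?)AATT/;

# Bounded quantifier: must consume at least 5 characters, so it runs over
# the first AATT and can only finish at a later one.
my ($bounded) = $dna =~ /CCGG((?:(?!CCGG).){5,10}?)AATT/;

print "*?      captured: $lazy_star\n";
print "{5,10}? captured: $bounded\n";
```

The second capture contains an AATT in its middle, which is the "dupe" described in the question.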

      Use YAPE::Regex::Explain to see what your regular expressions mean.

      Moreover, you are replying to a node that is not related to your question.

      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Great, please take it to Seekers Of Perl Wisdom, see Re^2: Using Look-ahead and Look-behind

      You forgot to include sample input. No matter: here are some clues. Run these and compare:

      perl -Mre=debug -le " $_ = q/foobarfoodrinkAATT/; /foo((?:(?!bar).){1,5}?)AATT/; "

      perl -Mre=debug -le " $_ = q/foobarfoodrinkAATTAATT/; /foo((?:(?!bar).){6,10}?)AATT/; "

      {50,100} means match at least 50 times but no more than 100

      * means match zero or more times

      in my short example, the first AATT starts at the sixth character after foo, so with a minimum of 6 it is included in the match
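The same reasoning suggests why adding AATT to the negative look-ahead cures the problem: the repeated group can no longer step onto an AATT, so a CCGG whose first AATT comes too early simply fails instead of running on to a later one. A scaled-down sketch, assuming invented sequences, {5,10} in place of {50,100}, and a hypothetical helper name between_sites:

```perl
use strict;
use warnings;

# Scaled-down version of the fixed pattern from the question.
sub between_sites {
    my ($dna) = @_;
    my ($captured) = $dna =~ m{
        CCGG
        (
          (?:
            (?!AATT|CCGG)   # refuse to step onto either marker
            .
          ){5,10}
        )
        AATT
    }x;
    return $captured;       # undef when nothing matches
}

for my $dna ('CCGGxxAATTyyyyAATT', 'CCGGyyyyyyAATT') {
    my $captured = between_sites($dna);
    printf "%-20s => %s\n", $dna, defined $captured ? $captured : '(no match)';
}
```

Because the look-ahead already halts the repetition at the first AATT, the greedy and non-greedy forms of the quantifier should capture the same text here, which fits the remark that the "?" is not required.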
