PerlMonks  

Re^4: Using Look-ahead and Look-behind

by JohnN (Initiate)
on Oct 15, 2012 at 15:09 UTC ([id://999100])


in reply to Re^3: Using Look-ahead and Look-behind
in thread Using Look-ahead and Look-behind

I have a dumb question.

This code works well (THANKS Roy!) when looking for DNA string matches within a genome sequence, but not when the *? is changed to {50,100}?.

e.g.
/CCGG                 # Match starting at DNA sequence CCGG
 (
  (?:
    (?!CCGG)          # make sure we're not finding duplicates mid-stream
    .                 # accept any character
  )*?                 # any number of times BUT not greedily <====
 )
 AATT                 # and ending at AATT
/x;

versus

/CCGG
 (
  (?:
    (?!CCGG)
    .
  ){50,100}?          # <====
 )
 AATT                 # and ending at AATT
/x;

This latter one does not have dupes of CCGG but does have dupes of AATT. The previous snippet has no dupes of either CCGG or AATT.

A follow-up: The following code snippet fixes my problem, and I have NO idea why! I tried it out of desperation.

/CCGG
 (
  (?:
    (?!AATT|CCGG)     # <=============
    .                 #
  ){50,100}?          # Here the "?" is not required but I'm anal
 )                    #
 AATT                 #
/x;
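One possible reading of why the (?!AATT|CCGG) version behaves differently, sketched on made-up strings rather than real genome data: the look-ahead now refuses to step onto any position where AATT begins, so the captured text can never contain an AATT. If the first AATT arrives before the 50th character, the repeat cannot reach its minimum and that starting CCGG is rejected. The variable names and test strings below are assumptions for illustration only.

```perl
use strict;
use warnings;

# Pattern from the post: neither AATT nor CCGG may begin inside the capture.
my $re = qr/CCGG((?:(?!AATT|CCGG).){50,100}?)AATT/;

# Hypothetical inputs: AATT 25 characters after CCGG, and AATT 60 characters after CCGG.
my $early = 'CCGG' . ('T' x 25) . 'AATT' . ('T' x 30) . 'AATT';
my $late  = 'CCGG' . ('T' x 60) . 'AATT';

my $early_hit  = $early =~ $re ? 1 : 0;   # repeat stalls at 25 iterations, below the minimum
my ($late_cap) = $late  =~ $re;           # 60 clean characters, within 50..100

print "early AATT: ", ($early_hit ? "match" : "no match"), "\n";
print "late AATT:  captured ", length($late_cap), " characters\n";
```

On these inputs the first string should not match at all, while the second should capture the 60 characters between CCGG and the only AATT.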

Replies are listed 'Best First'.
Re^5: Using Look-ahead and Look-behind
by choroba (Cardinal) on Oct 15, 2012 at 15:25 UTC
    When * is changed to ^, it does not work either. Why are you changing it at all?

    But jokes aside: the *? stops at the first occurrence of AATT, so there are no dupes. The {50,100} must match at least 50 times, so if there is an AATT after, say, the 25th character, it cannot stop there and must match a longer string.
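The difference described above can be seen on a small made-up string (not real genome data) in which the first AATT sits 25 characters after CCGG and a second one sits 59 characters after it. This is only a sketch; the variable names are assumptions.

```perl
use strict;
use warnings;

# Hypothetical input: CCGG, 25 filler chars, AATT, 30 filler chars, AATT.
my $dna = 'CCGG' . ('T' x 25) . 'AATT' . ('T' x 30) . 'AATT';

# Original pattern: lazy *? stops at the first AATT it can reach.
my ($star)    = $dna =~ /CCGG((?:(?!CCGG).)*?)AATT/;

# Bounded pattern: at least 50 characters must be consumed first,
# so the AATT at offset 25 is swallowed into the capture.
my ($bounded) = $dna =~ /CCGG((?:(?!CCGG).){50,100}?)AATT/;

printf "*?         captured %d characters\n", length $star;
printf "{50,100}?  captured %d characters, inner AATT at offset %d\n",
    length $bounded, index($bounded, 'AATT');
```

Here the *? version should capture 25 characters, while the {50,100}? version should capture 59 characters with the first AATT embedded in them.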

    Use YAPE::Regex::Explain to see what your regular expressions mean.

    Moreover, you are replying to a node that is not related to your question.

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re^5: Using Look-ahead and Look-behind
by Anonymous Monk on Oct 15, 2012 at 15:28 UTC

    Great, please take it to Seekers Of Perl Wisdom, see Re^2: Using Look-ahead and Look-behind

    You forgot to include sample input. No matter, here are some clues: run these and compare.

    perl -Mre=debug -le " $_ = q/foobarfoodrinkAATT/; /foo((?:(?!bar).){1,5}?)AATT/; "

    perl -Mre=debug -le " $_ = q/foobarfoodrinkAATTAATT/; /foo((?:(?!bar).){6,10}?)AATT/; "

    {50,100} means match at least 50 times but no more than 100

    * means match zero or more times

    In my second short example, the first AATT starts at the 6th character after foo, so with a minimum of 6 it is included in the match.
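For readers who only want the captured text rather than the full re=debug trace, the same two patterns can be run as a short script. This is a sketch; the variable names are assumptions, and the inputs are the ones from the one-liners above.

```perl
use strict;
use warnings;

# The first "foo" is followed by "bar", so the look-ahead rejects it
# and both matches begin at the second "foo".
my ($short) = 'foobarfoodrinkAATT'     =~ /foo((?:(?!bar).){1,5}?)AATT/;
my ($long)  = 'foobarfoodrinkAATTAATT' =~ /foo((?:(?!bar).){6,10}?)AATT/;

print "{1,5}?  captured: $short\n";
print "{6,10}? captured: $long\n";
```

The first capture should be "drink"; the second should be "drinkAATT", because six characters have to be consumed before AATT is allowed to end the match.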
