PerlMonks  


I think the documentation is a little misleading here. At least, it gives me the impression that the first match (if any) is somehow guaranteed to be valid (because it is codon-aligned). But that’s true only if, as in the example given, the $dna string happens to contain a valid match somewhere, in which case it will be found first. If it doesn’t, the first match is an invalid one:

#! perl
use strict;
use warnings;

while (my $dna = <DATA>)
{
    chomp $dna;
    print "\n\$dna = '$dna'\n";

    while ($dna =~ /(\w\w\w)*?TGA/g)
    {
        print 'Got a TGA stop codon at position ', pos $dna,
              ', immediately following [', $1, "]\n";
    }
}

__DATA__
ATCGTTGAA
ATCGTTGAATGCAAATGACATGAC

Output:

0:10 >perl 1476_SoPW.pl

$dna = 'ATCGTTGAA'
Got a TGA stop codon at position 8, immediately following [CGT]

$dna = 'ATCGTTGAATGCAAATGACATGAC'
Got a TGA stop codon at position 18, immediately following [AAA]
Use of uninitialized value $1 in print at 1476_SoPW.pl line 43, <DATA> line 2.
Got a TGA stop codon at position 23, immediately following []

0:10 >

Adding a \G anchor to the regex:

while ($dna =~ /\G(\w\w\w)*?TGA/g)

fixes the results for both DNA strings, because \G means "Match only at pos() (e.g. at the end-of-match position of prior m//g)" (see “Assertions” in perlre), and before the first match \G anchors at the start of the string, i.e. position zero.
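The effect of the \G anchor on the two sample strings can be checked with a short script. This is only a sketch: the helper name stop_positions is hypothetical (it does not appear in the original post), and it records the pos() after each match rather than printing $1.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return the pos() after
# every match of the \G-anchored regex in a single DNA string.
sub stop_positions {
    my ($dna) = @_;
    my @positions;
    # \G forces each attempt to begin exactly where the previous match ended
    # (or at the start of the string for the first attempt), so the engine
    # can never slide forward to an out-of-frame TGA.
    while ($dna =~ /\G(\w\w\w)*?TGA/g) {
        push @positions, pos $dna;
    }
    return @positions;
}

for my $dna (qw(ATCGTTGAA ATCGTTGAATGCAAATGACATGAC)) {
    my @found = stop_positions($dna);
    print "$dna: ", (@found ? "@found" : 'no codon-aligned TGA'), "\n";
}
```

With the anchor in place, the first string should report no match at all, and the second should report only position 18; the out-of-frame hits at positions 8 and 23 from the unanchored run are gone.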

<Begin update> choroba is of course correct: anchoring to the start of the string finds only the first match.

But that means that the regex could also be fixed without recourse to \G, by simply anchoring it to the start of the string:

while ($dna =~ /^(\w\w\w)*?TGA/g)

<End update>
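The difference between the two anchors only shows up when a string holds more than one in-frame stop codon. A minimal sketch, assuming a made-up test string 'ATGTGACCCTGA' (not from the thread) whose codons are ATG, TGA, CCC, TGA:

```perl
use strict;
use warnings;

# Assumed test data, not from the thread: two codon-aligned TGA stop codons.
my $dna = 'ATGTGACCCTGA';

# \G: each match must start where the previous one ended, so matching
# continues codon by codon through the whole string.
my @with_G;
push @with_G, pos($dna) while $dna =~ /\G(\w\w\w)*?TGA/g;

# ^: without /m it can match only at the very start of the string, so the
# second /g iteration, which resumes after the first match, has to fail.
my @with_caret;
push @with_caret, pos($dna) while $dna =~ /^(\w\w\w)*?TGA/g;

print "\\G anchor: @with_G\n";      # 6 12
print "^  anchor: @with_caret\n";  # 6
```

Both anchors reject out-of-frame matches, but only \G goes on to find the second aligned stop codon.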

Perhaps not Perl documentation’s finest hour. :-)

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,


In reply to Re: Understanding a portion of perlretut by Athanasius
in thread Understanding a portion on the Perlretut by BlueStarry
