in reply to Understanding a portion on the Perlretut

perlretut also has prose text to go with the code. That prose explains why it uses (\w\w\w)*?, namely to progress through the string in triplets instead of trying to match at each character position.

Re^2: Understanding a portion on the Perlretut
by BlueStarry (Novice) on Dec 09, 2015 at 13:26 UTC
    There is no such sentence in the explanation.

      I linked to perlretut. Going there, I find:

      The naive regexp

      ...

      doesn't work; it may match a TGA, but there is no guarantee that the match is aligned with codon boundaries, e.g., the substring GTT GAA gives a match. A better solution is

      while ($dna =~ /(\w\w\w)*?TGA/g) {  # note the minimal *?
          print "Got a TGA stop codon at position ", pos $dna, "\n";
      }

      which prints

      Got a TGA stop codon at position 18
      Got a TGA stop codon at position 23

      Position 18 is good, but position 23 is bogus. What happened?

      Maybe it was too obvious to me, but a codon is a nucleotide triplet.
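
To make the alignment issue concrete, here is a minimal sketch that runs three patterns over the sample sequence used in perlretut. The helper name stop_positions is hypothetical, made up for this illustration. The third pattern adds \G, which is how perlretut goes on to pin each match to the end of the previous one so the regex engine cannot slide forward one character at a time after a failed attempt.

```perl
use strict;
use warnings;

# Sample sequence as used in perlretut; read in triplets it is
# ATC GTT GAA TGC AAA TGA CAT GAC
my $dna = "ATCGTTGAATGCAAATGACATGAC";

# Hypothetical helper (not from perlretut): run a pattern with /g in
# scalar context and collect pos() after every successful match.
sub stop_positions {
    my ($seq, $re) = @_;
    my @found;
    while ($seq =~ /$re/g) {
        push @found, pos $seq;
    }
    return @found;
}

# Naive: matches TGA anywhere, including across codon boundaries.
my @naive = stop_positions($dna, qr/TGA/);

# Triplets: consumes whole triplets before TGA, but the engine may still
# restart one character further along after a failed attempt.
my @triplets = stop_positions($dna, qr/(\w\w\w)*?TGA/);

# Anchored: \G forces each attempt to begin where the last match ended.
my @anchored = stop_positions($dna, qr/\G(\w\w\w)*?TGA/);

print "naive:    @naive\n";
print "triplets: @triplets\n";
print "anchored: @anchored\n";
```

This prints 8 18 23 for the naive pattern, 18 23 for the triplet pattern, and only 18 for the anchored one. Position 8 comes from the TGA that straddles GTT GAA, and position 23 appears because, after failing at offset 18, the unanchored pattern is retried at offsets 19 and 20 and finds the TGA inside CAT GAC.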