Also, I note a reference to codons, which implies that your tests should be considering a stride of 3 rather than an arbitrary position.
This is an excellent point. For the benefit of the OP, here is one way to ensure that only codon-sequences are captured:
#! perl
use strict;
use warnings;

my $seq = 'AATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAACGAA';

# Adapted from the regex by stevieb
my $re = qr{
    (                          # capture each sequence:
        ATG                    #  - which begins with the codon ATG
        (?: [ACGT]{3} )*?      #  - followed by the smallest number of codons
        (?: TAG | TAA | TGA )  #  - and ending with the codon TAG, TAA, or TGA
    )
}x;

print "$1\n" while $seq =~ /$re/g;
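To see why the stride of 3 matters, here is a rough comparison of a pattern that lets the stop triplet start at any offset with one that only steps in whole codons. The sub names naive_orfs and codon_orfs and the short test sequence are made up for this illustration; they do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.

# Naive pattern: the stop triplet may begin at ANY offset after ATG.
sub naive_orfs {
    my ($seq) = @_;
    return $seq =~ /(ATG.*?(?:TAG|TAA|TGA))/g;
}

# Codon-aware pattern: only whole codons may sit between ATG and the stop.
sub codon_orfs {
    my ($seq) = @_;
    return $seq =~ /(ATG(?:[ACGT]{3})*?(?:TAG|TAA|TGA))/g;
}

my $demo = 'ATGCTGACCTAA';    # made-up sequence: ATG CTG ACC TAA

print "naive: $_\n" for naive_orfs($demo);    # ATGCTGA (7 bases, out of frame)
print "codon: $_\n" for codon_orfs($demo);    # ATGCTGACCTAA (4 whole codons)
```

The naive pattern stops at the TGA that straddles two codons (CTG ACC), so its match is not a multiple of 3 bases long. The codon-aware pattern skips it and ends at the in-frame TAA.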
(This assumes that only minimal sequences are wanted — an assumption which should be clarified, as Laurent_R has pointed out, above.)
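If the longest sequence from each start codon is wanted instead, one possible variation (a sketch only, not something the OP has asked for) is to swap the lazy quantifier *? for the greedy *. The array names below are invented for the comparison:

```perl
use strict;
use warnings;

my $seq = 'AATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAACGAA';

# Lazy quantifier (*?): each match ends at the FIRST in-frame stop codon.
my @shortest = $seq =~ /(ATG(?:[ACGT]{3})*?(?:TAG|TAA|TGA))/g;

# Greedy quantifier (*): each match runs on to the LAST in-frame stop codon.
my @longest  = $seq =~ /(ATG(?:[ACGT]{3})*(?:TAG|TAA|TGA))/g;

printf "lazy:   %2d bases  %s\n", length($_), $_ for @shortest;
printf "greedy: %2d bases  %s\n", length($_), $_ for @longest;
```

On this sequence the lazy version finds two sequences (30 and 9 bases), while the greedy version returns a single 48-base sequence that swallows the second ATG, because that ATG lies in the same reading frame as the first.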
Hope that helps,
Athanasius <°(((>< contra mundum    Iustus alius egestas vitae, eros Piratica,
In reply to Re^2: Regular expressions
by Athanasius
in thread Regular expressions
by lairel