in reply to Print A Sequence with Start codon and different Stop Codon
$sequence =~ /(ATG.*?(?<=TAA|TAG|TGA))/g
to get
ATGGTTTCTCCCATCTCTCCATCGGCATAA
ATGA
It still extracts the shortest possible sequence for each starting point (so we lost the second output).
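To see that behaviour concretely, here is a minimal sketch that runs the pattern in list context. The sample string is an assumption: it is simply the longest line of the output printed later in this post, and the real $sequence from the question may have extra characters on either side.

```perl
use strict;
use warnings;
use feature 'say';

# Assumed sample input, not necessarily the original $sequence.
my $sequence = 'ATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAA';

# The non-greedy .*? stops at the first position that is immediately
# preceded by a stop codon, so each ATG the regex reaches yields only
# its shortest candidate. Matching then resumes after that candidate.
my @shortest = $sequence =~ /(ATG.*?(?<=TAA|TAG|TGA))/g;

say for @shortest;
```

With this input the list holds two strings, and the longer candidates that start at the first ATG never show up.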
Update: It's possible to get all the sequences without relying on experimental regex features or on the return value of print (as is done here).
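use feature 'say';    # $sequence is assumed to be set already

# Offsets of every start codon.
my @from;
my $pos = -1;
push @from, $pos while -1 != ($pos = index $sequence, 'ATG', $pos + 1);

# Offsets just past the end of every stop codon.
my @to;
for my $end (qw( TAA TAG TGA )) {
    $pos = -1;
    push @to, $pos + 3 while -1 != ($pos = index $sequence, $end, $pos + 1);
}

for my $f (@from) {
    for my $t (@to) {
        # $t > $f + 3 rather than $t > $f: a stop codon ending on the
        # A of ATG (as in TAATG) would otherwise print a bare "A".
        say substr $sequence, $f, $t - $f if $t > $f + 3;
    }
}
__END__
Output:
ATGGTTTCTCCCATCTCTCCATCGGCATAA
ATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAA
ATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGA
ATGATCTAA
ATGA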
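If the candidates are needed as a list rather than printed, the same index-based idea can be wrapped in subroutines. This is only a sketch: the names codon_offsets and start_stop_candidates are made up for illustration, and the sample string is again just the longest output line above, not necessarily the original $sequence. It also requires each stop codon to end past the start codon, which skips a stop codon that merely overlaps the leading A of ATG.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: every offset at which $codon occurs in $dna,
# overlapping occurrences included.
sub codon_offsets {
    my ($dna, $codon) = @_;
    my @offsets;
    my $pos = -1;
    push @offsets, $pos while -1 != ($pos = index $dna, $codon, $pos + 1);
    return @offsets;
}

# Hypothetical wrapper: every substring that starts at an ATG and ends
# with a stop codon finishing after that ATG. No reading-frame check.
sub start_stop_candidates {
    my ($dna) = @_;
    my @from = codon_offsets($dna, 'ATG');
    # Offsets just past each stop codon, grouped TAA, TAG, TGA.
    my @to = map { $_ + 3 } map { codon_offsets($dna, $_) } qw( TAA TAG TGA );
    my @found;
    for my $f (@from) {
        for my $t (@to) {
            push @found, substr $dna, $f, $t - $f if $t > $f + 3;
        }
    }
    return @found;
}

# Assumed sample input.
my @candidates =
    start_stop_candidates('ATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAA');
say for @candidates;
```

Returning a list keeps the search separate from the printing, so the result can be sorted, filtered by length, or counted before anything is output.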