$sequence =~ /(ATG.*?(?<=TAA|TAG|TGA))/g
to get
ATGGTTTCTCCCATCTCTCCATCGGCATAA
ATGA
It still extracts the shortest possible sequence for each starting point (so we lost the second output).
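For anyone who wants to run the regex as a complete script, here is a minimal sketch. The value of $sequence is an assumption, reconstructed from the outputs shown in this thread, so substitute your own data. The match is evaluated in list context so that /g returns every non-overlapping capture at once.

```perl
use strict;
use warnings;
use feature 'say';

# Assumed input, reconstructed from the outputs quoted in this thread.
my $sequence = 'ATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAA';

# List context + /g: collect every non-overlapping match of the capture group.
# .*? is non-greedy, so each match stops at the first position that is
# immediately preceded by a stop codon (checked by the lookbehind).
my @orfs = $sequence =~ /(ATG.*?(?<=TAA|TAG|TGA))/g;

say for @orfs;
```

Because the lookbehind only inspects the three characters before the current position, the G of a start codon can double as the G of a TGA stop, which is why the short ATGA match appears.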
Update: It's possible to get all the sequences without experimental regex features and without depending on the return value of print like here.
my @from;
my $pos = -1;
push @from, $pos while -1 != ($pos = index $sequence, 'ATG', $pos + 1);

my @to;
for my $end (qw( TAA TAG TGA )) {
    $pos = -1;
    push @to, $pos + 3 while -1 != ($pos = index $sequence, $end, $pos + 1);
}

for my $f (@from) {
    for my $t (@to) {
        say substr $sequence, $f, $t - $f if $t > $f;
    }
}

__END__
Output:
ATGGTTTCTCCCATCTCTCCATCGGCATAA
ATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAA
ATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGA
ATGATCTAA
ATGA
In reply to Re: Print A Sequence with Start codon and different Stop Codon
by choroba
in thread Print A Sequence with Start codon and different Stop Codon
by PerlKc