I have a DNA sequence and I want to find all start codons (ATG, GTG) and stop codons. I want to translate every possible subsequence between a start codon and a stop codon into its corresponding protein sequence. This should be done on the 1st reading frame only. For instance, $Dna = "AAAATGGGGTAAGTGAACGGGTAA" should return the proteins corresponding to "ATGGGGTAA" and "GTGAACGGGTAA", but it should also work for very long sequences.

I tried to do something like this in the middle of my code:

    # Collect start codon positions that fall in reading frame 1
    while ($seq =~ m/ATG|TTG|CTG|ATT|CTA|GTG/gi) {
        my $matchPosition = pos($seq) - 3;    # index of the codon's first base
        if (($matchPosition % 3) == 0) {
            push(@startsRF1, $matchPosition);
        }
    }

    # Collect stop codon positions that fall in reading frame 1
    while ($seq =~ m/TAG|TAA|TGA/gi) {
        my $matchPosition = pos($seq);        # index just past the codon's last base
        if (($matchPosition % 3) == 0) {
            push(@stopsRF1, $matchPosition);
        }
    }

But basically, I need to put all possible subsequences between starts and stops, as above, in an array and then translate each of them.
I need help with this.

In reply to Orf subsequences by odegbon
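
One way the behavior described above could be sketched is to walk the sequence codon by codon in frame 1 instead of matching with a global regex, so that an out-of-frame match can never hide an in-frame codon. This is only a minimal sketch under stated assumptions: the sub names `find_orfs_frame1` and `translate` are made up for illustration, the start set is limited to ATG and GTG as in the example, every start seen before a stop is paired with that stop, translation is a plain lookup in the standard genetic code (so GTG gives V rather than M), and the `//` operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Standard genetic code, with codons enumerated in T, C, A, G order.
my @bases = qw(T C A G);
my @amino = split //,
    'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_to_aa;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_to_aa{"$b1$b2$b3"} = shift @amino;
        }
    }
}

# Assumption: only the start codons named in the example are used.
my %is_start = map { $_ => 1 } qw(ATG GTG);
my %is_stop  = map { $_ => 1 } qw(TAA TAG TGA);

# Return every frame-1 subsequence that runs from a start codon
# to the next in-frame stop codon, stop codon included.
sub find_orfs_frame1 {
    my ($seq) = @_;
    $seq = uc $seq;
    my (@orfs, @open_starts);
    for (my $pos = 0; $pos + 3 <= length($seq); $pos += 3) {
        my $codon = substr($seq, $pos, 3);
        if ($is_start{$codon}) {
            push @open_starts, $pos;
        }
        elsif ($is_stop{$codon}) {
            # Close every start that is still waiting for a stop.
            push @orfs, substr($seq, $_, $pos + 3 - $_) for @open_starts;
            @open_starts = ();
        }
    }
    return @orfs;
}

# Translate codon by codon, stopping at the first stop codon.
# Codons not in the table (for example ones containing N) become X.
sub translate {
    my ($orf) = @_;
    my $protein = '';
    for (my $pos = 0; $pos + 3 <= length($orf); $pos += 3) {
        my $aa = $codon_to_aa{ substr($orf, $pos, 3) } // 'X';
        last if $aa eq '*';
        $protein .= $aa;
    }
    return $protein;
}

my $dna = 'AAAATGGGGTAAGTGAACGGGTAA';
for my $orf (find_orfs_frame1($dna)) {
    print "$orf\t", translate($orf), "\n";
}
```

For the example sequence this yields the two subsequences "ATGGGGTAA" and "GTGAACGGGTAA". Because the loop makes a single pass and only keeps the open start positions in memory, it should also cope with very long sequences.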