Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks.

I can't work out why the following *experimental* code appends the one-letter amino acid code to the end of the initial DNA string instead of displaying only the amino acid in question. The program should display only 'L', not 'TTAL'. Note: this is not a homework assignment.

use strict;
use warnings;
use feature qw(postderef);
no warnings qw(experimental::postderef);
use feature qw(say);

my $dna = 'TTA';
my @proteinString;
my $codon;

for (my $i = 0; $i < (length($dna) - 2); $i += 3) {
    $codon = substr($dna, $i, 3);
    push @proteinString, $codon .= codonToAminoAcid($codon);
}

my $proteinString = \@proteinString;

for my $element ( $proteinString->@* ) {
    say $element;
}

sub codonToAminoAcid {
    my ($codon) = @_;
    if    ( $codon =~ /GC./i )        { return 'A' }
    elsif ( $codon =~ /TG[TC]/i )     { return 'C' }
    elsif ( $codon =~ /GA[TC]/i )     { return 'D' }
    elsif ( $codon =~ /GA[AG]/i )     { return 'E' }
    elsif ( $codon =~ /TT[TC]/i )     { return 'F' }
    elsif ( $codon =~ /GG./i )        { return 'G' }
    elsif ( $codon =~ /CA[TC]/i )     { return 'H' }
    elsif ( $codon =~ /AT[TCA]/i )    { return 'I' }
    elsif ( $codon =~ /AA[AG]/i )     { return 'K' }
    elsif ( $codon =~ /TT[AG]|CT./i ) { return 'L' }
    elsif ( $codon =~ /ATG/i )        { return 'M' }
    elsif ( $codon =~ /AA[TC]/i )     { return 'N' }
    elsif ( $codon =~ /CC./i )        { return 'P' }
    elsif ( $codon =~ /CA[AG]/i )     { return 'Q' }
    elsif ( $codon =~ /CG.|AG[AG]/i ) { return 'R' }
    elsif ( $codon =~ /TC.|AG[TC]/i ) { return 'S' }
    elsif ( $codon =~ /AC./i )        { return 'T' }
    elsif ( $codon =~ /GT./i )        { return 'V' }
    elsif ( $codon =~ /TGG/i )        { return 'W' }
    elsif ( $codon =~ /TA[TC]/i )     { return 'Y' }
    elsif ( $codon =~ /TA[AG]|TGA/i ) { return '_' }
    else {
        print "Unrecognized codon: \"$codon\"!\n";
        exit;
    }
}


Thanks.

Re: Displayng the correct codon in a DNA string.
by Your Mother (Archbishop) on Apr 25, 2016 at 02:45 UTC

    Another approach. I recommend leaving off the experimental dereferencing stuff, but to each her own, and it’s your cup of hemlock. :P

    #!/usr/bin/env perl
    use 5.01;
    use strict;
    use warnings;

    my $dna = shift || "TTATATAAAAGU";

    my $matcher = make_matcher();

    my @protein;
    while ( $dna =~ /\G(?<codon>...)/g ) # Or maybe [ACTGU]{3} + /i
    {
        push @protein, $matcher->($+{codon});
    }

    say $_ || "DERP!" for @protein;

    exit;

    sub make_matcher {
        my %map = (
            qr/GC./        => "A",
            qr/TG[TC]/     => "C",
            qr/GA[TC]/     => "D",
            qr/GA[AG]/     => "E",
            qr/TT[TC]/     => "F",
            qr/GG./        => "G",
            qr/CA[TC]/     => "H",
            qr/AT[TCA]/    => "I",
            qr/AA[AG]/     => "K",
            qr/TT[AG]|CT./ => "L",
            qr/ATG/        => "M",
            qr/AA[TC]/     => "N",
            qr/CC./        => "P",
            qr/CA[AG]/     => "Q",
            qr/CG.|AG[AG]/ => "R",
            qr/TC.|AG[TC]/ => "S",
            qr/AC./        => "T",
            qr/GT./        => "V",
            qr/TGG/        => "W",
            qr/TA[TC]/     => "Y",
            qr/TA[AG]|TGA/ => "_"
        );

        sub {
            my $codon = uc +shift;
            for my $key ( keys %map )
            {
                return $map{$key} if $codon =~ $key;
            }
        };
    }
      :) If order is important, use an array of arrays instead of a hash.

        Yes. Good call out.
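For anyone who wants to see the array-of-arrays idea spelled out, here is a minimal sketch. The name make_ordered_matcher and the \A...\z anchors are my own additions, not something from the posts above. As far as I can tell the patterns in this particular table never overlap for a three-letter codon, so the order only starts to matter once you add patterns that do.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): the pairs are tried
# top to bottom, so the first matching pattern always wins.
sub make_ordered_matcher {
    my @table = (
        [ qr/\AGC.\z/              => 'A' ],
        [ qr/\ATG[TC]\z/           => 'C' ],
        [ qr/\AGA[TC]\z/           => 'D' ],
        [ qr/\AGA[AG]\z/           => 'E' ],
        [ qr/\ATT[TC]\z/           => 'F' ],
        [ qr/\AGG.\z/              => 'G' ],
        [ qr/\ACA[TC]\z/           => 'H' ],
        [ qr/\AAT[TCA]\z/          => 'I' ],
        [ qr/\AAA[AG]\z/           => 'K' ],
        [ qr/\A(?:TT[AG]|CT.)\z/   => 'L' ],
        [ qr/\AATG\z/              => 'M' ],
        [ qr/\AAA[TC]\z/           => 'N' ],
        [ qr/\ACC.\z/              => 'P' ],
        [ qr/\ACA[AG]\z/           => 'Q' ],
        [ qr/\A(?:CG.|AG[AG])\z/   => 'R' ],
        [ qr/\A(?:TC.|AG[TC])\z/   => 'S' ],
        [ qr/\AAC.\z/              => 'T' ],
        [ qr/\AGT.\z/              => 'V' ],
        [ qr/\ATGG\z/              => 'W' ],
        [ qr/\ATA[TC]\z/           => 'Y' ],
        [ qr/\A(?:TA[AG]|TGA)\z/   => '_' ],
    );

    return sub {
        my $codon = uc shift;
        for my $pair (@table) {
            my ( $re, $aa ) = @$pair;
            return $aa if $codon =~ $re;
        }
        return;    # no pattern matched: undef in scalar context
    };
}

my $matcher = make_ordered_matcher();

# '?' marks any codon the table does not recognize.
my $protein = join '', map { $matcher->($_) // '?' } 'TTATATAAA' =~ /.../g;
print "$protein\n";    # prints "LYK"
```

Unlike the hash version, the regex objects stay compiled here instead of being stringified into hash keys and recompiled on each match.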

      my @protein = map $matcher->($_), $dna =~ /.../g;

        I like the cut of your jib. :P
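A small sketch of what the /.../g part of that one-liner does on its own, assuming an input whose length is not a multiple of three (the 11-base string below is made up for illustration). In list context, a /g match with no capture groups returns every whole match, and a trailing partial codon is dropped without any warning.

```perl
use strict;
use warnings;

# Made-up input: three full codons followed by a dangling 'AG'.
my $dna = 'TTATATAAAAG';

# List context, no capture groups: each complete 3-character match is returned.
my @codons = $dna =~ /.../g;

# Number of trailing bases that did not form a full codon.
my $leftover = length($dna) % 3;

print "codons: @codons\n";         # prints "codons: TTA TAT AAA"
print "leftover bases: $leftover\n";    # prints "leftover bases: 2"
```

If a truncated sequence should be treated as an error rather than ignored, checking the remainder like this before translating is one way to catch it.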

Re: Displayng the correct codon in a DNA string.
by Anonymous Monk on Apr 25, 2016 at 02:13 UTC
    push @proteinString, codonToAminoAcid($codon);

      Specifically, because the OP's push @proteinString, $codon .= codonToAminoAcid($codon); will append the 'L' to 'TTA', then push the entire string to the array, whereas AM's suggested push @proteinString, codonToAminoAcid($codon); just pushes the 'L' into the array.
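To see the difference in isolation, here is a stripped-down sketch. The leucine_only sub is a hypothetical stand-in for the OP's codonToAminoAcid and always returns 'L'; everything else mirrors the two push lines being compared.

```perl
use strict;
use warnings;

# Hypothetical stand-in for codonToAminoAcid(): always returns 'L'.
sub leucine_only { return 'L' }

my $codon = 'TTA';
my @buggy;
# .= appends to $codon in place; the resulting 'TTAL' is what gets pushed.
push @buggy, $codon .= leucine_only($codon);

$codon = 'TTA';
my @fixed;
# No assignment operator: only the sub's return value is pushed.
push @fixed, leucine_only($codon);

print "buggy: @buggy\n";    # prints "buggy: TTAL"
print "fixed: @fixed\n";    # prints "fixed: L"
```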

        Thank you all! I certainly appreciate the insightful and beneficial feedback.