reading RNA codons into hash

ic23oluk has asked for the wisdom of the Perl Monks concerning the following question:

hello monks,

I am trying to read the RNA triplet code into a hash, where the triplets are the keys and the one-letter amino acid codes are the values. I read the information from a text file that has the following structure:

X whitespace codon_1 codon_2 ...

. . .

X stands for the one-letter code of the amino acid.

Here's my code:

my %code;
my $file = 'code.txt';
open (READ, $file) || die "Cannot open $file: $!\n";

while (my $line = <READ>){
    chomp $line;
    if ($line =~ /^(\w+)\s+([\w]+)$/i){
        (%code) = (
        "$2" => $1
        );
        next;
    }
    if ($line =~ /^(\w+)\s+([\w]+)\s+([\w]+)$/i){
        (%code) = (
        "$2" => $1, "$3" => $1
        );
        next;
    }
    if ($line =~ /^(\w+)\s+([\w]+)\s+([\w]+)\s+([\w]+)$/i){
        (%code) = (
        "$2" => $1, "$3" => $1, "$4" => $1
        );
        next;
    }
    if ($line =~ /^(\w+)\s+([\w]+)\s+([\w]+)\s+([\w]+)\s+([\w]+)$/i){
        (%code) = (
        "$2" => $1, "$3" => $1, "$4" => $1, "$5" => $1
        );
        next;
    }
    if ($line =~ /^(\w+)\s+([\w]+)\s+([\w]+)\s+([\w]+)\s+([\w]+)\s+([\w]+)$/i){
        (%code) = (
        "$2" => $1, "$3" => $1, "$4" => $1, "$5" => $1, "$6" => $1
        );
        next;
    }
    if ($line =~ /^(\w+)\s+([\w]+)\s+([\w]+)\s+([\w]+)\s+([\w]+)\s+([\w]+)\s+([\w]+)$/i){
        (%code) = (
        "$2" => $1, "$3" => $1, "$4" => $1, "$5" => $1, "$6" => $1, "$7" => $1
        );
        next;
    }
}

foreach (keys %code){
    print $_, "\t", $code{"$_"}, "\n";
}

The output I get is just the last line (UAA, UAG, UGA Stop). Could anyone point out the problem?

thanks in advance


Replies are listed 'Best First'.
Re: reading RNA codons into hash by choroba (Cardinal) on Jul 13, 2017 at 09:48 UTC
By assigning to the whole hash, you're overwriting it for each line.

`%hash = (key => 'value'); # Removes previous contents of %hash.`

Assign just to the value corresponding to a key:

`$hash{key} = 'value';`

In your case, it's

`$code{$2} = $1; $code{$3} = $1; ...`

Update: Also, `[\w]` can be written as just `\w`. Using `/i` only makes sense if your regex contains letters; yours only contains `\w` and `\s`. Moreover, you can probably simplify the whole program (I'm just guessing, as you haven't provided any sample data) to

    while (my $line = <READ>) {
        my ($acid, @codons) = split ' ', $line;
        $code{$_} = $acid for @codons;
    }
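To make the difference concrete, here is a minimal, self-contained sketch that contrasts list assignment to the whole hash with assignment to individual elements. The sample lines are invented for illustration (only the stop codons appear in the question); the real `code.txt` was never posted, so its exact layout is an assumption.

```perl
use strict;
use warnings;

# Invented sample data; the actual code.txt was not shown in the thread.
my @lines = ("M AUG", "W UGG", "Stop UAA UAG UGA");

# List assignment replaces the entire hash on every iteration,
# so only the codons from the last line survive.
my %overwritten;
for my $line (@lines) {
    my ($acid, @codons) = split ' ', $line;
    %overwritten = map { $_ => $acid } @codons;
}

# Element assignment adds or updates one key at a time,
# so entries from earlier lines are kept.
my %code;
for my $line (@lines) {
    my ($acid, @codons) = split ' ', $line;
    $code{$_} = $acid for @codons;
}

printf "overwritten: %d keys, accumulated: %d keys\n",
    scalar(keys %overwritten), scalar(keys %code);
```

With this sample data the first hash ends up with 3 keys (the stop codons only) and the second with all 5, which matches the symptom described in the question.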
Re^2: reading RNA codons into hash by ic23oluk (Sexton) on Jul 13, 2017 at 10:12 UTC
Thank you very much!
Re: reading RNA codons into hash by QM (Parson) on Jul 13, 2017 at 10:00 UTC
Not sure what your problem is, but that's way too much code :D I think you want something like this:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    our %code;

    while (<>){
        chomp;
        my @tokens = split " ", $_;
        my $protein = shift @tokens;
        for my $token (@tokens) {
            $code{$token} = $protein;
        }
    }

    foreach my $key (sort keys %code){
        print "$key\t$code{$key}\n";
    }

Using this input file:

    A ABC DEF GHI JKL
    B QRS TUV WXY
    C EYE LUV YOU

I get this result:

    ABC	A
    DEF	A
    EYE	C
    GHI	A
    JKL	A
    LUV	C
    QRS	B
    TUV	B
    WXY	B
    YOU	C

-QM
--
Quantum Mechanics: The dreams stuff is made of
Re^2: reading RNA codons into hash by ic23oluk (Sexton) on Jul 13, 2017 at 10:13 UTC
Thank you, too!
Re: reading RNA codons into hash by AnomalousMonk (Archbishop) on Jul 13, 2017 at 10:35 UTC
This is the sort of thing I think should be encapsulated in a module rather than in a funky old `.txt` file you have to parse every time you use it. Here's an example I put together in another context. It's directed to DNA rather than RNA (I think ... I'm not a BioMonk), but you should be able to get the general idea.

File: `CodonToAmino.pm` (about 2 kB; collapsed in the original post)

Give a man a fish: `<%-{-{-{-<`
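Since the module itself is not reproduced in the thread, the following is only a rough sketch of the general idea, not the module from the post. The package name `CodonTable` and the function `codon_to_amino` are hypothetical, the table is a small excerpt of the standard genetic code using RNA codons, and the `package NAME { ... }` block syntax assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical package and function names; not the CodonToAmino.pm from the post.
package CodonTable {
    # Small excerpt of the standard genetic code (RNA codons), for illustration only.
    my %amino_for = (
        AUG => 'M',
        UGG => 'W',
        UUU => 'F', UUC => 'F',
        UAA => '*', UAG => '*', UGA => '*',   # stop codons
    );

    # Return the one-letter amino acid for a codon (case-insensitive),
    # or undef if the codon is not in the table.
    sub codon_to_amino {
        my ($codon) = @_;
        return $amino_for{ uc $codon };
    }
}

# Translate a few codons and print the resulting one-letter string.
print join('', map { CodonTable::codon_to_amino($_) } qw(AUG UUU UGG UAA)), "\n";
```

Keeping the table in code means callers never parse a data file at run time; in a real project the package would live in its own `.pm` file and hold all 64 codons.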
Re^2: reading RNA codons into hash by Anonymous Monk on Jul 13, 2017 at 13:54 UTC
Strange but true: there isn't just one genetic code, there are several depending on the organism; see list of genetic codes. There are also some additional weird variants due to RNA editing that happens in certain situations.
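One way to accommodate several genetic codes in a hash-based lookup, sketched here under assumptions, is to keep the standard table and layer per-code differences on top of it. The variable names are made up for illustration, and the tables are excerpts only: the overrides shown are the differences between the standard code and the vertebrate mitochondrial code (NCBI translation tables 1 and 2).

```perl
use strict;
use warnings;

# Excerpt of the standard code (NCBI translation table 1); not a full table.
my %standard = (AUG => 'M', UGG => 'W', UGA => '*', AGA => 'R', AUA => 'I');

# Differences in the vertebrate mitochondrial code (NCBI translation table 2).
my %vert_mito_overrides = (UGA => 'W', AGA => '*', AGG => '*', AUA => 'M');

# In a list assignment later pairs win, so the overrides replace
# the corresponding standard entries and everything else is inherited.
my %vert_mito = (%standard, %vert_mito_overrides);

for my $codon (qw(UGA AUA)) {
    print "$codon: standard=$standard{$codon} mito=$vert_mito{$codon}\n";
}
```

Here the list assignment to the whole hash is deliberate: `%vert_mito` is built once from two complete lists instead of being filled line by line.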