in reply to perl mean hydrophobicity protein fasta

You forgot to translate your problem into a Perl problem. That means that even if we know how to solve whatever Perl issue you're having, it's hard to even understand the question.

Your code doesn't help. You should use more explicit names for your variables. Avoid @aa: even if you know it means amino acids, it looks like you didn't want to pick a proper name and couldn't use @a because it was already taken. Avoid single-letter names, especially $a, which is a special variable (sort sets $a and $b, so some functions and modules may change their values). And most of all, avoid having several variables with the same name. Even if the different types are enough to tell them apart, sigil variance (the @ and % that become $ when you access a single element) just makes the whole thing confusing.
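To make the naming advice concrete, here is a minimal sketch. The names @residues and %hydropathy_of are only illustrations I made up, not something from your script, and the values are copied from the reference table in this thread.

```perl
use strict;
use warnings;

# One descriptive name per variable (all names here are illustrative).
my @residues      = qw(A P E R L);
my %hydropathy_of = (A => 1.8, P => -1.6, E => -3.5, R => -4.5, L => 3.8);

# Accessing a single element switches the sigil to $. With distinct
# names there is no doubt which container is being read.
my $first_residue = $residues[0];
my $first_value   = $hydropathy_of{$first_residue};
print "$first_residue: $first_value\n";

# sort assigns the package variables $a and $b for every comparison,
# which is why they should not be reused for anything else.
my @by_value = sort { $hydropathy_of{$a} <=> $hydropathy_of{$b} } @residues;
print "@by_value\n";
```

This prints "A: 1.8" and then the residues ordered from most negative to most positive value, "R E P A L".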

Since all your data is on a single line, and you do not split it while reading the input file, it all ends up in $key and nothing is pushed into the array. So either the sample data you gave us isn't formatted the way your input file is, or there's something wrong with your code.
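If the table really does arrive on one line, a global match in a while loop is one way to pull the records apart. This is only a sketch under assumptions: the sample string is a shortened copy of your data, the pattern expects "number, code, name" triples, and it would misfire on lines that also carry free-text notes.

```perl
use strict;
use warnings;

# Assumed input: every record of the table sits on one line.
my $single_line = '1.800 A Ala -3.500 BB Asx 2.500 C Cys';

my %reference;
# With /g in scalar context, each iteration resumes after the previous
# match, so one pass over the line yields one record per iteration.
while ($single_line =~ /(-?\d+(?:\.\d+)?)\s+(\S+)\s+(\S+)/g) {
    $reference{$2} = { Value => $1, Name => $3 };
}

for my $code (sort keys %reference) {
    print "$code => $reference{$code}{Name} ($reference{$code}{Value})\n";
}
```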

Here's a little something, from what I guessed you are trying to do:

use strict;
use warnings;
use Data::Dumper;

my %reference;
$\ = "\n";    # print now appends a newline automatically

while (<DATA>) {
    # Fill a hash with the information from the reference table.
    # Three groups of non-space characters separated by blanks.
    $reference{$2} = { Value => $1, Name => $3 } if /(\S+)\s+(\S+)\s+(\S+)/;
}

# Show the content of the hash
print Dumper \%reference;

# quotemeta adds a \ in front of special chars (here *) so that they lose their special meaning in the regex
# map applies the expression on all the elements in the list, so here this is a quotemeta applied on all the keys
my $pattern = join "|", map quotemeta, keys %reference;
my $sequence = 'ABBAPERL**';

# With this method, BB and ** become a single element, not two as in split //
my @acids = $sequence =~ /($pattern)/g;
print "Split sequence is: @acids\n";

my %count;
my $sum = 0;
for my $acid (@acids) {
    # Translate the name with the reference table
    print "Found $reference{$acid}{Name}";
    $count{$acid}++;
    $sum += $reference{$acid}{Value};
}
print "Sum: $sum\n";

# Bonus: a quick way to translate all the acids into their longer name, using map to apply the translation on the whole list
print join " ", map $reference{$_}{Name}, @acids;

__DATA__
 1.800 A  Ala
-3.500 BB Asx
 2.500 C  Cys Note: Columns 1-8 must contain 1 numeric value only
-3.500 D  Asp
-3.500 E  Glu Note: This file is required for amphpathic helic
 2.800 F  Phe
-0.400 G  Gly
-3.200 H  His
 4.500 I  Ile
-3.900 K  Lys
 3.800 L  Leu
 1.900 M  Met
-3.500 N  Asn
-1.600 P  Pro
-3.500 Q  Gln
-4.500 R  Arg
-0.800 S  Ser
-0.700 T  Thr
 4.200 V  Val
-0.900 W  Trp
-0.490 X- Unk
-1.300 Y  Tyr
-3.500 ZZ Glx
-0.490 ** ***

My variable names may not be that good, but you understand what everything means better than I do.
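Since the thread title asks for a mean, here is a rough sketch of that last step, assuming that "mean hydrophobicity" simply means the sum of the table values divided by the number of residues found in the table. The helper name mean_hydropathy and the small %value_of hash are hypothetical; the values are copied from the reference table.

```perl
use strict;
use warnings;

# Hypothetical helper: average the table values of the given residues.
# Residues missing from the table are skipped instead of counted as 0.
sub mean_hydropathy {
    my ($value_of, @residues) = @_;
    my @known = grep { exists $value_of->{$_} } @residues;
    return unless @known;    # nothing to average
    my $sum = 0;
    $sum += $value_of->{$_} for @known;
    return $sum / @known;    # @known in numeric context is its length
}

my %value_of = (A => 1.8, P => -1.6, E => -3.5, R => -4.5, L => 3.8);
my @residues = split //, 'APERL';
printf "Mean hydropathy: %.3f\n", mean_hydropathy(\%value_of, @residues);
```

For 'APERL' the sum is -4.0 over 5 residues, so this prints "Mean hydropathy: -0.800". Note that split // only works for one-letter codes; for the two-character codes in the table (BB, ZZ, **) you would keep the regex-based splitting instead.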

Re^2: perl mean hydrophobicity protein fasta
by Megiddo (Initiate) on Sep 09, 2016 at 17:13 UTC
    Thank you. It helps. I think I'll get to the bottom of this. Sorry for not being clear.