Counting amino acids

yuvraj_ghaly has asked for the wisdom of the Perl Monks concerning the following question:

Re: Counting amino acids by davido (Cardinal) on Jul 19, 2013 at 04:48 UTC
That's a product spec. What have you written so far that we might help you with?

Dave
Re^2: Counting amino acids by yuvraj_ghaly (Sexton) on Jul 19, 2013 at 06:46 UTC
#! /usr/bin/perl -w
use strict;

open (S, "$ARGV[0]") || die "cannot open FASTA file to read: $!";

my %s;   # a hash of arrays, to hold each line of sequence
my %seq; # a hash to hold the AA sequences.
my $key;

while (<S>){ # Read the FASTA file.
    chomp;
    if (/>/){
        s/>//;
        $key= $_;
    }else{
        push (@{$s{$key}}, $_);
    }
}

foreach my $a (keys %s){
    my $s= join("", @{$s{$a}});
    $seq{$a}=$s;
    #print("$a\t$s\n");
}

my @aa= qw(A R N D C Q E G H I L K M F P S T W Y V);
my $aa= join("\t", @aa);
print ("Sequence\t$aa\n");

foreach my $k (keys %seq){
    my %count; # a hash to hold the count for each amino acid in the protein.
    my @seq= split(//, $seq{$k});
    foreach my $r(@seq){
        $count{$r}++;
    }
    my @row;
    push(@row, $k);
    foreach my $a (@aa){
        $count{$a}||=0;
        $count{$a}= sprintf("%0.1f",($count{$a}/length($seq{$k}))*100);
        push(@row,$count{$a});
    }
    my $row= join("\t",@row);
    print("\n$row\n");
}

This is the code, but it gives percentages as output and I need decimals.
Re^3: Counting amino acids by roboticus (Chancellor) on Jul 19, 2013 at 11:20 UTC
yuvraj_ghaly: It looks like your problem is on line 42. Just stop converting the values to percentages and convert them to the format you want instead.

...roboticus

When your only tool is a hammer, all problems look like your thumb.
Re^4: Counting amino acids by MidLifeXis (Monsignor) on Jul 19, 2013 at 13:03 UTC
Re: Counting amino acids by space_monk (Chaplain) on Jul 19, 2013 at 05:53 UTC
As the previous poster has said, you have not given us much to go on, but if you are just starting out and have nothing so far, here's a plan of attack for the problem.

One way to do it is to write your own routine that reads the FASTA format and then simply greps for or counts the amino acids. This is not the recommended way, however!

A better way is to look on CPAN and see if someone has done some or all of the work for you. Bio::DB::Fasta and Bio::SeqIO, FastaParse and Bio::Phylo::IO are all starting points for reading FASTA files and parsing FASTA data.

If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)
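For illustration only, here is a minimal sketch of the hand-rolled route mentioned above (not the CPAN modules). The sub name `read_fasta` and the inline sample data are hypothetical; the sketch assumes header lines start with `>` and that sequence lines may be wrapped.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA records from an open filehandle and
# return a hash reference mapping each header (without '>') to its sequence.
sub read_fasta {
    my ($fh) = @_;
    my (%seq, $id);
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;              # tolerate Windows line endings
        next if $line eq '';           # skip blank lines
        if ($line =~ /^>(.*)/) {
            $id = $1;
            $seq{$id} = '';
        }
        elsif (defined $id) {
            $seq{$id} .= $line;        # join wrapped sequence lines
        }
    }
    return \%seq;
}

# Illustrative data only; a real script would open the file named in $ARGV[0].
my $fasta = ">seq1 demo\nMKV\nLAA\n>seq2\nGGC\n";
open my $fh, '<', \$fasta or die "cannot open in-memory FASTA: $!";
my $records = read_fasta($fh);
close $fh;

for my $id (sort keys %$records) {
    print "$id\t$records->{$id}\n";
}
```

Anchoring the header match with `^>` avoids treating a `>` in the middle of a line as a new record, which the unanchored `/>/` test would do.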
Re: Counting amino acids by mtmcc (Hermit) on Jul 19, 2013 at 07:03 UTC
Ah, you want to modify this script? If you don't want percentages, you can just modify the last $count line, from `$count{$a}= sprintf("%0.1f",($count{$a}/length($seq{$k}))*100);` to this: `$count{$a}= sprintf("%0.6f",($count{$a}/length($seq{$k})));`
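To make the difference between those two `sprintf` lines concrete, here is a small sketch with made-up numbers (3 occurrences of a residue in a 20-residue sequence; both values are assumptions chosen for illustration).

```perl
use strict;
use warnings;

# Made-up values for illustration: 3 occurrences in a 20-residue sequence.
my $count  = 3;
my $length = 20;

# Original form: ratio scaled to a percentage, one decimal place.
my $percent  = sprintf("%0.1f", ($count / $length) * 100);

# Modified form: plain ratio, six decimal places.
my $fraction = sprintf("%0.6f", $count / $length);

print "percent:  $percent\n";    # 15.0
print "fraction: $fraction\n";   # 0.150000
```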
Re^2: Counting amino acids by yuvraj_ghaly (Sexton) on Jul 19, 2013 at 07:14 UTC
See, the problem is that I don't want any decimals. This code divides the count of each amino acid by the overall length of the sequence. I just need the amino acid counts.
Re^3: Counting amino acids by mtmcc (Hermit) on Jul 19, 2013 at 07:19 UTC
Well then, replace the line I mentioned above with: `$count{$a};`
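As a rough sketch of what the counting step looks like once the division is dropped, assuming the same `@aa` list as the posted script; the sequence string and the row label `demo` are made up for illustration.

```perl
use strict;
use warnings;

my @aa  = qw(A R N D C Q E G H I L K M F P S T W Y V);
my $seq = 'MKVLAAGA';   # made-up sequence, for illustration only

# Tally every residue in the sequence.
my %count;
$count{$_}++ for split //, $seq;

# Raw counts in @aa order; residues that never occur get 0.
my @row = map { $count{$_} || 0 } @aa;

print join("\t", 'Sequence', @aa), "\n";
print join("\t", 'demo', @row), "\n";
```

The `|| 0` default plays the same role as `$count{$a}||=0` in the posted script, so absent residues print as 0 rather than as an empty field.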
Re^4: Counting amino acids by yuvraj_ghaly (Sexton) on Jul 19, 2013 at 07:24 UTC