in reply to Re^13: calculation of charged amino acids
in thread calculation of charged amino acids

This is the sample FASTA file for which the program gives the wrong counts. I counted the residues manually, and my counts do not match the program output.
for aliphatic residues:
A 37
V 10
L 12
I 6
Total= 65
for polar residues:
D 12
E 7
H 2
K 12
N 3
Q 4
R 20
S 7
T 4
Z 0
Total= 71
for nonpolar residues:
A 37
C 3
F 2
G 17
I 6
L 12
M 3
P 13
V 10
W 3
Y 1
Total= 107

If you run the program on this sequence you will see the error.
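A manual tally like the one above can be cross-checked with a few lines of Perl. This is only a minimal sketch: `tally_residues` and `group_total` are hypothetical helper names, not part of the program under discussion, and the input here is just the first ten residues of the sequence, used for illustration. It assumes Perl 5.10 or later for the `//` operator.

```perl
use strict;
use warnings;

# Count how often each one-letter residue code occurs in a sequence string.
# (Hypothetical helper, not taken from the program in this thread.)
sub tally_residues {
    my ($seq) = @_;
    my %tally;
    $tally{$_}++ for grep { /[A-Z]/ } split(//, uc $seq);
    return \%tally;
}

# Sum the tallies for the residues named in $members, e.g. 'AVLI'.
sub group_total {
    my ($tally, $members) = @_;
    my $total = 0;
    $total += ($tally->{$_} // 0) for split(//, $members);
    return $total;
}

# Short illustrative fragment only; pass the full sequence to check real counts.
my $tally = tally_residues('MSVDKAFRDM');
printf "%s %d\n", $_, $tally->{$_} for sort keys %$tally;
printf "Total aliphatic (AVLI) = %d\n", group_total($tally, 'AVLI');
```

Printing the per-letter tally next to each group total makes it easy to see which residue a disagreement comes from.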


Re^15: calculation of charged amino acids
by mtmcc (Hermit) on Jul 31, 2013 at 09:41 UTC
    You didn't post a fasta file, but in any case, the matching is behaving exactly as it's written.

    It won't match every residue against every group; it will only match each residue to the first group that appears in your regex. If that doesn't make sense, have a look at this.
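    To make the difference concrete, here is a minimal sketch. The original code is not shown in this post, so the "first match wins" loop below is only a guess at the kind of construct that behaves this way. The sub names and the test string `HKDF` are made up for illustration; the character classes are the ones used elsewhere in this thread.

```perl
use strict;
use warnings;

# Three of the groups discussed in this thread; H belongs to all three.
my @groups = (
    [ basic    => qr/[KRH]/ ],
    [ aromatic => qr/[FHYW]/ ],
    [ polar    => qr/[DEHKNQRSTZ]/ ],
);

# Hypothetical "first match wins" counter: each residue is credited
# only to the first group whose class matches it.
sub count_first_match {
    my ($seq) = @_;
    my %count = map { ($_->[0] => 0) } @groups;
    for my $res (split(//, $seq)) {
        for my $g (@groups) {
            if ($res =~ $g->[1]) {
                $count{ $g->[0] }++;
                last;    # later groups never see this residue
            }
        }
    }
    return \%count;
}

# Independent tests: a residue is credited to every group it belongs to.
sub count_all_groups {
    my ($seq) = @_;
    my %count = map { ($_->[0] => 0) } @groups;
    for my $res (split(//, $seq)) {
        for my $g (@groups) {
            $count{ $g->[0] }++ if $res =~ $g->[1];
        }
    }
    return \%count;
}

my $seq   = 'HKDF';
my $first = count_first_match($seq);
my $all   = count_all_groups($seq);
for my $name (map { $_->[0] } @groups) {
    printf "%-9s first-match=%d all-groups=%d\n",
        $name, $first->{$name}, $all->{$name};
}
```

    For `HKDF`, the first-match version credits H and K to basic only, so aromatic and polar come out as 1 each. With independent tests, aromatic is 2 and polar is 3, because H is counted in all three groups and K in two.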

      Here is the FASTA sequence in the file:

      >gi|44809|emb|CAA29377.1| unnamed protein product [Myxococcus xanthus]
      MSVDKAFRDMIRNEIEVQLKPLRDVVARLEEGTADLDALRNVAERLAPLAEVVGPLFGAQIPAAAKAGRR
      GPGRPPAARSAVTAAPAAVGGKRRGRKPAAAGADGSRACAIIGCGKPSRTKGYCAAHYQKLRMLEKTNR
      RPSDWKDYADPDSVDDIKLPRGRAASKALAAAAQAGHAG
        #!/usr/bin/perl
        use strict;
        use warnings;
        use autodie;

        # Per-record counters. They start at 0 so the "> 0" tests in the
        # report below never compare against an undefined value.
        my $a     = 0;    # acidic
        my $b     = 0;    # basic
        my $ali   = 0;    # aliphatic
        my $aro   = 0;    # aromatic
        my $po    = 0;    # polar
        my $nonpo = 0;    # nonpolar
        my $u     = 0;    # unknown
        my $sum   = 0;    # residues seen in the current record
        my $tag   = 0;    # header line of the current record
        my @line;
        my $header = '';
        my (@headers_found, %header_data);

        open (my $fasta_fh, '<', $ARGV[0]);

        while (<$fasta_fh>) {
            chomp;

            # Sequence line: test each residue against every group independently,
            # so a residue is counted in each group it belongs to.
            if ($_ !~ /^>(.*)$/) {
                @line = split('', $_);
                for (@line) {
                    $a     += 1 if $_ =~ /[BDEZ]/;
                    $b     += 1 if $_ =~ /[KRH]/;
                    $ali   += 1 if $_ =~ /[AVLI]/;
                    $aro   += 1 if $_ =~ /[FHYW]/;
                    $po    += 1 if $_ =~ /[DEHKNQRSTZ]/;
                    $nonpo += 1 if $_ =~ /[ACFGILMPVWY]/;
                    $u     += 1 if $_ =~ /[XUGJOP]/;
                    $sum   += 1;
                }
            }

            # New header or end of file: store the finished record and reset.
            if ((/^>(.*)$/) || (eof)) {
                if ($sum > 0) {
                    $header_data{$tag}{'a'}     = $a;
                    $header_data{$tag}{'b'}     = $b;
                    $header_data{$tag}{'ali'}   = $ali;
                    $header_data{$tag}{'aro'}   = $aro;
                    $header_data{$tag}{'po'}    = $po;
                    $header_data{$tag}{'nonpo'} = $nonpo;
                    $header_data{$tag}{'u'}     = $u;
                    $a     = 0;
                    $b     = 0;
                    $ali   = 0;
                    $aro   = 0;
                    $po    = 0;
                    $u     = 0;
                    $nonpo = 0;
                    $sum   = 0;
                }
                $tag = $_ unless eof;
                push (@headers_found, $tag) unless (eof);
            }
        }
        close $fasta_fh;

        # Report every non-zero count for each record, in file order.
        for (@headers_found) {
            print STDOUT "\nHeader: $_\n";
            print STDOUT "Acidic: $header_data{$_}{'a'}\n"        if $header_data{$_}{'a'} > 0;
            print STDOUT "Basic: $header_data{$_}{'b'}\n"         if $header_data{$_}{'b'} > 0;
            print STDOUT "Aliphatic: $header_data{$_}{'ali'}\n"   if $header_data{$_}{'ali'} > 0;
            print STDOUT "Aromatic: $header_data{$_}{'aro'}\n"    if $header_data{$_}{'aro'} > 0;
            print STDOUT "Polar: $header_data{$_}{'po'}\n"        if $header_data{$_}{'po'} > 0;
            print STDOUT "Nonpolar: $header_data{$_}{'nonpo'}\n"  if $header_data{$_}{'nonpo'} > 0;
            print STDOUT "Unknown: $header_data{$_}{'u'}\n"       if $header_data{$_}{'u'} > 0;
        }
Re^15: calculation of charged amino acids
by marto (Cardinal) on Jul 31, 2013 at 09:51 UTC

    Your posts are becoming silly. You're reintroducing problems you've previously been shown how to fix. Please stop ignoring what people are telling you, please make notes and learn from your mistakes.

    You seem to be relying on others to write code for you. This isn't really a wise strategy. Most people are happy to help you learn, point out mistakes and suggest improvements, but depending on the goodwill of others to do your job/tasks isn't fair. What will you do when people get tired of doing things for you? Review the links you've previously been given and learn the basics of the tool you have chosen to use.