in reply to Re^14: calculation of charged amino acids
in thread calculation of charged amino acids

You didn't post a fasta file, but in any case, the matching is behaving exactly as it's written.

It won't match every residue against every group; it only matches each residue to the first group that appears in your regex. If that doesn't make sense, have a look at this.
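To make the distinction concrete, here is a minimal sketch. It is not the code from the parent post: the group table, the sub names `count_first_match` and `count_all_matches`, and the test string are all made up for illustration. The first sub stops at the first group that matches a residue; the second tests every residue against every group.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical group table, for illustration only.
# H appears in both classes, so it shows the difference.
my @groups = (
    [ basic    => qr/[KRH]/  ],
    [ aromatic => qr/[FHYW]/ ],
);

# First-match-only: a residue is counted in at most one group.
sub count_first_match {
    my ($seq) = @_;
    my %count = map { ($_->[0], 0) } @groups;
    for my $res (split //, $seq) {
        for my $g (@groups) {
            if ($res =~ $g->[1]) {
                $count{ $g->[0] }++;
                last;    # stop at the first group that matches
            }
        }
    }
    return \%count;
}

# Independent tests: a residue is counted in every group it belongs to.
sub count_all_matches {
    my ($seq) = @_;
    my %count = map { ($_->[0], 0) } @groups;
    for my $res (split //, $seq) {
        for my $g (@groups) {
            $count{ $g->[0] }++ if $res =~ $g->[1];
        }
    }
    return \%count;
}

my $seq   = 'KHW';
my $first = count_first_match($seq);
my $all   = count_all_matches($seq);
print "first-match: basic=$first->{basic} aromatic=$first->{aromatic}\n";
print "all-matches: basic=$all->{basic} aromatic=$all->{aromatic}\n";
```

With the made-up input `KHW`, the H is counted only as basic in the first variant, but as both basic and aromatic in the second, so the aromatic totals differ (1 versus 2).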

Re^16: calculation of charged amino acids
by yuvraj_ghaly (Sexton) on Aug 01, 2013 at 04:22 UTC

    Here is the FASTA sequence in the file, followed by the script:

    >gi|44809|emb|CAA29377.1| unnamed protein product [Myxococcus xanthus]
    MSVDKAFRDMIRNEIEVQLKPLRDVVARLEEGTADLDALRNVAERLAPLAEVVGPLFGAQIPAAAKAGRR
    GPGRPPAARSAVTAAPAAVGGKRRGRKPAAAGADGSRACAIIGCGKPSRTKGYCAAHYQKLRMLEKTNR
    RPSDWKDYADPDSVDDIKLPRGRAASKALAAAAQAGHAG

      #!/usr/bin/perl
      use strict;
      use warnings;
      use autodie;

      my $a;
      my $b;
      my $ali;
      my $aro;
      my $po;
      my $nonpo;
      my $sum = 0;
      my $u = 0;
      my $tag = 0;
      my @line;
      my $header = '';
      my (@headers_found, %header_data);

      open (my $fasta_fh, '<', $ARGV[0]);
      while (<$fasta_fh>) {
          chomp;
          if ($_ !~ /^>(.*)$/) {
              @line = split('',$_);
              for (@line) {
                  $a += 1 if $_ =~ /[BDEZ]/;
                  $b += 1 if $_ =~ /[KRH]/;
                  $ali += 1 if $_ =~ /[AVLI]/;
                  $aro += 1 if $_ =~ /[FHYW]/;
                  $po += 1 if $_ =~ /[DEHKNQRSTZ]/;
                  $nonpo += 1 if $_ =~ /[ACFGILMPVWY]/;
                  $u += 1 if $_ =~ /[XUGJOP]/;
                  $sum += 1;
              }
          }
          if ((/^>(.*)$/) || (eof)) {
              if ($sum > 0) {
                  $header_data{$tag}{'a'} = $a;
                  $header_data{$tag}{'b'} = $b;
                  $header_data{$tag}{'ali'} = $ali;
                  $header_data{$tag}{'aro'} = $aro;
                  $header_data{$tag}{'po'} = $po;
                  $header_data{$tag}{'nonpo'} = $nonpo;
                  $header_data{$tag}{'u'} = $u;
                  $a = 0; $b = 0; $ali = 0; $aro = 0;
                  $po = 0; $u = 0; $nonpo = 0; $sum = 0;
              }
              $tag = $_ unless eof;
              push (@headers_found, $tag) unless (eof);
          }
      }
      close $fasta_fh;

      for (@headers_found) {
          print STDOUT "\nHeader: $_\n";
          print STDOUT "Acidic: $header_data{$_}{'a'}\n" if $header_data{$_}{'a'} > 0;
          print STDOUT "Basic: $header_data{$_}{'b'}\n" if $header_data{$_}{'b'} > 0;
          print STDOUT "Aliphatic: $header_data{$_}{'ali'}\n" if $header_data{$_}{'ali'} > 0;
          print STDOUT "Aromatic: $header_data{$_}{'aro'}\n" if $header_data{$_}{'aro'} > 0;
          print STDOUT "Polar: $header_data{$_}{'po'}\n" if $header_data{$_}{'po'} > 0;
          print STDOUT "Nonpolar: $header_data{$_}{'nonpo'}\n" if $header_data{$_}{'nonpo'} > 0;
          print STDOUT "Unknown: $header_data{$_}{'u'}\n" if $header_data{$_}{'u'} > 0;
      }