in reply to Re^10: calculation of charged amino acids
in thread calculation of charged amino acids

Dude.

I'm not a code writing service.

Please read How do I post a question effectively?

This is very likely the last update I'll ever write for this. Ever.

#!/usr/bin/perl -w use strict; use warnings; my $file = $ARGV[0]; my @currentProtein; my %protein; open (FASTA, "<", $file) || die "Can't open $file\n"; my $fastaLine; my $protSequence; my @fastaTag; my @tagOrder; while (<FASTA>) { print STDERR "CURRENT: $_"; chomp; if (($_ !~ /^>/) && ($_ =~ /\w/)) { push (@currentProtein, $_); } if ((/^>/) || (eof)) { if ((@currentProtein > 0) || (eof)) { $protSequence = join("", @currentProtein); $protSequence =~ s/ //g; @fastaTag = split(" ", $fastaLine); $protein{$fastaTag[0]} = $protSequence; push (@tagOrder, $fastaTag[0]); @currentProtein = (); } $fastaLine = $_ if $_ =~ /\w/; } } close FASTA; for(@tagOrder) { my $count_of_acidic = 0; my $count_of_basic = 0; my $count_of_neutral = 0; my $aa; my $sequence = "$protein{$_}"; $sequence =~ s/\s//g; my @prot=split("",$sequence); #splits string into an array #print " \nThe original PROTEIN file is:\n$sequence \n"; while(@prot) { $aa = shift (@prot); if($aa =~/[DNEQ]/ig) { $count_of_acidic++; } if($aa=~/[KRH]/ig) { $count_of_basic++; } if($aa=~/[DNEQKRH]/ig) { $count_of_neutral++; } } print "\nName: $_\n"; print "Number of acidic amino acids:".$count_of_acidic."\n"; print "Number of basic amino acids:".$count_of_basic."\n"; print "Number of neutral amino acids:".$count_of_neutral."\n"; }

Replies are listed 'Best First'.
Re^12: calculation of charged amino acids
by yuvraj_ghaly (Sexton) on Jul 25, 2013 at 05:31 UTC

    thankz mate

Re^12: calculation of charged amino acids
by yuvraj_ghaly (Sexton) on Jul 30, 2013 at 06:23 UTC

    modification of this code is done here but error is showing only in counting "aromatic residue"

    any suggestions

    #!/usr/bin/env perl use 5.010; use strict; use warnings; use autodie; my $header = ''; my (@headers_found, %header_data); open my $fasta_fh, '<', $ARGV[0]; while (<$fasta_fh>) { chomp; next if /^\s*($|#)/; if (/^>(.*)$/) { push @headers_found, ($header = $1); } else { die 'Sequence data found without a header!' unless $header; for (split '', "\U$_") { /(?<a>[BDEZ])|(?<b>[KRH])|(?<ali>[AVLI])|(?<aro>[FHYW])|(?< +po>[DEHKNQRSTZ])|(?<nonpo>[ACFGILMPVWY])/; if ($_ =~ /(?<u>[XUGJOP])/){ next;} ++$header_data{$header}{(keys %+)[0]}; } } } close $fasta_fh; for (@headers_found) { say "Header: $_"; say "\tAcidic: $header_data{$_}{a}"; say "\tBasic: $header_data{$_}{b}"; say "\tAliphatic: $header_data{$_}{ali}"; say "\tAromatic: $header_data{$_}{aro}"; say "\tPolar: $header_data{$_}{po}"; say "\tNonpolar: $header_data{$_}{nonpo}"; # say "\tUnknown: $header_data{$_}{u}"; }
      I haven't really done anything to it, but it works for me.

      Using this fasta file:

      >DROTME_HH_Q02936 
      MRHIAHTQRCLSRLTSLVALLLIVLPMVFSPAHSCGPGRGLGRHRARNLY
      PLVLKQTIPNLSEYTNSASGPLEGVIRRDSPKFKDLVPNYNRDILFRDE
      >DROME_HH_Q02937 
      MRHIAHTQRCLSRLTSLVALLLIVLPMVFSPAHSCGPGRGLGRHRARNLY
      PLVLKQTIPNLSEYTNSASGPLEGVIRRDSPKFKDLVPNYNRDILFRDEE
      GTGADRLMSKRCKEKLNVLAYSVMNEWPGIRLLTTTTTTESWDEDYHHGQ
      YEGRAVTIATSDRDQSKYGMLARLAVEAGFDWVSYVSRRHIYCSVKSDSS
      ESLH
      >DROME_HH_Q02938 
      GTGADRLMSKRCKEKLNVLAYSVMNEWPGIRLLTTTTTTESWDEDYHHGQ
      YEGRAVTIATSDRDQSKYGMLARLAVEAGFDWVSYVSRRHIYCSVKSDSS
      ESLH
      

      And this code (essentially the same as you've posted):

      #!/usr/bin/perl use strict; use warnings; use autodie; my $header = ''; my (@headers_found, %header_data); open my $fasta_fh, '<', $ARGV[0]; while (<$fasta_fh>) { chomp; next if /^\s*($|#)/; if (/^>(.*)$/) { push @headers_found, ($header = $1); } else { die 'Sequence data found without a header!' unless $he +ader; for (split '', "\U$_") { /(?<a>[BDEZ])|(?<b>[KRH])|(?<ali>[AVLI])|(?<aro +>[FHYW])|(?<po>[DEHKNQRSTZ])|(?<nonpo>[ACFGILMPVWY])/; if ($_ =~ /(?<u>[XUGJOP])/){ next;} ++$header_data{$header}{(keys %+)[0]}; } } } close $fasta_fh; for (@headers_found) { print STDERR "\n\nHeader: $_"; print STDERR "\nAcidic: $header_data{$_}{a}"; print STDERR "\nBasic: $header_data{$_}{b}"; print STDERR "\nAliphatic: $header_data{$_}{ali}"; print STDERR "\nAromatic: $header_data{$_}{aro}"; print STDERR "\nPolar: $header_data{$_}{po}"; print STDERR "\nNonpolar: $header_data{$_}{nonpo}"; print STDERR "\nUnknown: $header_data{$_}{u}" if defined $head +er_data{$_}{u}; }

      I get this output:

      Header: DROTME_HH_Q02936 
      Acidic:  7
      Basic:   18
      Aliphatic: 31
      Aromatic: 6
      Polar: 19
      Nonpolar: 4
      
      Header: DROME_HH_Q02937 
      Acidic:  22
      Basic:   35
      Aliphatic: 56
      Aromatic: 16
      Polar: 44
      Nonpolar: 9
      
      Header: DROME_HH_Q02938 
      Acidic:  14
      Basic:   17
      Aliphatic: 25
      Aromatic: 10
      Polar: 25
      Nonpolar: 5
      

      If you're not getting that result, then post the error message that you get.

        This works fine now. However it is not giving any output file as required.

         perl count_aa.pl <filename> >> <output file>

        this is the command i used in console screen. If i want to get output in a file, it is not working

        This is the sample fasta file for which program gives wrong calculation. I have manually counted the residues which does not correspond to program output
        for Aromatic residues:
        A 37
        V 10
        L 12
        I 6
        Total= 65 for polar residues:
        D 12
        E 7
        H 2
        K 12
        N 3
        Q 4
        R 20
        S 7
        T 4
        Z 0
        Total= 71
        for nonpolar residues:
        A 37
        C 3
        F 2
        G 17
        I 6
        L 12
        M 3
        P 13
        V 10
        W 3
        Y 1
        Total= 107

        if you run the program using this sequence you will find the error.