in reply to Re^12: calculation of charged amino acids
in thread calculation of charged amino acids

I haven't really done anything to it, but it works for me.

Using this fasta file:

>DROTME_HH_Q02936 
MRHIAHTQRCLSRLTSLVALLLIVLPMVFSPAHSCGPGRGLGRHRARNLY
PLVLKQTIPNLSEYTNSASGPLEGVIRRDSPKFKDLVPNYNRDILFRDE
>DROME_HH_Q02937 
MRHIAHTQRCLSRLTSLVALLLIVLPMVFSPAHSCGPGRGLGRHRARNLY
PLVLKQTIPNLSEYTNSASGPLEGVIRRDSPKFKDLVPNYNRDILFRDEE
GTGADRLMSKRCKEKLNVLAYSVMNEWPGIRLLTTTTTTESWDEDYHHGQ
YEGRAVTIATSDRDQSKYGMLARLAVEAGFDWVSYVSRRHIYCSVKSDSS
ESLH
>DROME_HH_Q02938 
GTGADRLMSKRCKEKLNVLAYSVMNEWPGIRLLTTTTTTESWDEDYHHGQ
YEGRAVTIATSDRDQSKYGMLARLAVEAGFDWVSYVSRRHIYCSVKSDSS
ESLH

And this code (essentially the same as you've posted):

#!/usr/bin/perl use strict; use warnings; use autodie; my $header = ''; my (@headers_found, %header_data); open my $fasta_fh, '<', $ARGV[0]; while (<$fasta_fh>) { chomp; next if /^\s*($|#)/; if (/^>(.*)$/) { push @headers_found, ($header = $1); } else { die 'Sequence data found without a header!' unless $he +ader; for (split '', "\U$_") { /(?<a>[BDEZ])|(?<b>[KRH])|(?<ali>[AVLI])|(?<aro +>[FHYW])|(?<po>[DEHKNQRSTZ])|(?<nonpo>[ACFGILMPVWY])/; if ($_ =~ /(?<u>[XUGJOP])/){ next;} ++$header_data{$header}{(keys %+)[0]}; } } } close $fasta_fh; for (@headers_found) { print STDERR "\n\nHeader: $_"; print STDERR "\nAcidic: $header_data{$_}{a}"; print STDERR "\nBasic: $header_data{$_}{b}"; print STDERR "\nAliphatic: $header_data{$_}{ali}"; print STDERR "\nAromatic: $header_data{$_}{aro}"; print STDERR "\nPolar: $header_data{$_}{po}"; print STDERR "\nNonpolar: $header_data{$_}{nonpo}"; print STDERR "\nUnknown: $header_data{$_}{u}" if defined $head +er_data{$_}{u}; }

I get this output:

Header: DROTME_HH_Q02936 
Acidic:  7
Basic:   18
Aliphatic: 31
Aromatic: 6
Polar: 19
Nonpolar: 4

Header: DROME_HH_Q02937 
Acidic:  22
Basic:   35
Aliphatic: 56
Aromatic: 16
Polar: 44
Nonpolar: 9

Header: DROME_HH_Q02938 
Acidic:  14
Basic:   17
Aliphatic: 25
Aromatic: 10
Polar: 25
Nonpolar: 5

If you're not getting that result, then post the error message that you get.

Replies are listed 'Best First'.
Re^14: calculation of charged amino acids
by yuvraj_ghaly (Sexton) on Jul 31, 2013 at 05:14 UTC

    This works fine now. However it is not giving any output file as required.

     perl count_aa.pl <filename> >> <output file>

    this is the command i used in console screen. If i want to get output in a file, it is not working

      this is the command i used in console screen

      Try perl count_aa.pl infile >> outfile

        this is what i did

         perl count_aa.pl M.xanthum_protein.fasta >> output.txt

        But it isn't working

Re^14: calculation of charged amino acids
by yuvraj_ghaly (Sexton) on Jul 31, 2013 at 07:02 UTC

    This is the sample fasta file for which program gives wrong calculation. I have manually counted the residues which does not correspond to program output
    for Aromatic residues:
    A 37
    V 10
    L 12
    I 6
    Total= 65 for polar residues:
    D 12
    E 7
    H 2
    K 12
    N 3
    Q 4
    R 20
    S 7
    T 4
    Z 0
    Total= 71
    for nonpolar residues:
    A 37
    C 3
    F 2
    G 17
    I 6
    L 12
    M 3
    P 13
    V 10
    W 3
    Y 1
    Total= 107

    if you run the program using this sequence you will find the error.

      You didn't post a fasta file, but in any case, the matching is behaving exactly as it's written.

      It won't match all the residues to all of the groups, it will only match each residue to the first group that appears in your regex. If that doesn't make sense, have a look at this.

        Here is the fasta sequence in file

        >gi|44809|emb|CAA29377.1| unnamed protein product [Myxococcus xanthus] MSVDKAFRDMIRNEIEVQLKPLRDVVARLEEGTADLDALRNVAERLAPLAEVVGPLFGAQIPAAAKAGRR +GPGRPPAARSAVTAAPAAVGGKRRGRKPAAAGADGSRACAIIGCGKPSRTKGYCAAHYQKLRMLEKTNR +RPSDWKDYADPDSVDDIKLPRGRAASKALAAAAQAGHAG

      Your posts are becoming silly. You're reintroducing problems you've previously been shown how to fix. Please stop ignoring what people are telling you, please make notes and learn from your mistakes.

      You seem to be relying on others writing code for you. This isn't really a wise strategy. Most people are happy to help you learn and to point out mistakes and suggest improvements, depending on the good will of others to do your job/tasks isn't fair. What will you do when people get tired of doing things for you? Review the links you've previously been given and learn the basics of the tool you have chosen to use.