in reply to Re: Mean calculation problem
in thread Mean calculation problem

Thanks for taking the time to write the code; it helped a lot. But is there a way for me to apply this to a large chunk of data? The amino acid groups run from 1, 2, ... up to 264, and there can be around 2,600 ATOM lines, depending on the PDB file.

Re^3: Mean calculation problem
by BrowserUk (Patriarch) on Jan 13, 2010 at 13:43 UTC

    Sure. If you supply this version:

    #! perl -slw
    use strict;
    use List::Util qw[ sum ];

    BEGIN{ @ARGV = map glob( $_ ), @ARGV }

    my %stats;
    while( <> ) {
        next unless m[^ATOM];
        my @fields = unpack 'a6 a5 a3 a3 a4 x2 a3 x4 a8 a8 a7 a7 a6 a12', +$_;
    #    print join'|', @fields;
        push @{ $stats{ $fields[ 4 ] } }, $fields[ 10 ];
    }

    printf "%3s : %.3f%%\n", $_, sum( @{ $stats{ $_ } } ) / @{ $stats{ $_ } }
        for sort keys %stats;

    with a wildcard path/filename on the command line,

    theScript.pl /path/to/*.pdb

    it will perform the stats across all the matching files.

    Alternatively, if you want the stats on a file-by-file basis, then supply the filenames one at a time:

    ## For windows; something similar is possible on *nix
    for %i in (\path\to\*.pdb) do @theScript.pl %i
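    If you would rather get the per-file figures from a single run instead of a shell loop, one possible approach is to key the stats on the filename as well as the group. This is only a rough sketch, not a tested drop-in: the sub name means_by_file and the $TEMPLATE variable are made up for illustration, and it assumes the same unpack template and field positions as the script above.

```perl
#! perl -w
use strict;
use List::Util qw[ sum ];

## Same field layout as the script above; adjust it if your files differ.
my $TEMPLATE = 'a6 a5 a3 a3 a4 x2 a3 x4 a8 a8 a7 a7 a6 a12';

## Hypothetical helper: takes a list of filenames and returns
## a hash ref of the form { filename => { group => mean } }.
sub means_by_file {
    my %stats;
    for my $file ( @_ ) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while( my $line = <$fh> ) {
            next unless $line =~ m[^ATOM];
            my @fields = unpack $TEMPLATE, $line;
            ## Collect the values per file, then per group within that file.
            push @{ $stats{ $file }{ $fields[ 4 ] } }, $fields[ 10 ];
        }
        close $fh;
    }

    my %means;
    for my $file ( keys %stats ) {
        for my $group ( keys %{ $stats{ $file } } ) {
            my $values = $stats{ $file }{ $group };
            $means{ $file }{ $group } = sum( @$values ) / @$values;
        }
    }
    return \%means;
}

## Only print when run as a script, so the sub can also be loaded elsewhere.
unless( caller ) {
    my $means = means_by_file( map glob( $_ ), @ARGV );
    for my $file ( sort keys %$means ) {
        print "$file\n";
        printf "%3s : %.3f%%\n", $_, $means->{ $file }{ $_ }
            for sort keys %{ $means->{ $file } };
    }
}
```

    Saved under a name of your choosing, it takes the same wildcard argument as before and prints one block of means per file.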

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.