Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, I have a school assignment in Biology but I can't work out how to begin. What I was given is something like the following. This is my input file:
>2BL2A
H1-H2
MMDYLITQV
GMVFAVLAMATATIFSGIGSAKGVGMTGEAAAALTT
M--F #(amino acids that interact)
L--V
H1-H6
MMDYLITQV
SVVQGLNFLGASLPIAFTGLFSGIAQGKVAAAGIQILAKK
M--V
L--V
I--V
M--V
I want the output file to look something like this:
>2BL2A
H1 MMDYLITQV
H2 GMVFAVLAMATATIFSGIGSAKGVGMTGEAAAALTT
D=1 I=1 L=1 M=2 Q=1 T=1 Y=1 V=1 The total aa num =9
A=9 E=1 F=2 G=6 I=2 K=1 L=2 M=3 S=2 T=5 V=3 The total aa num =36
MF 1-->4 ## 1 is how many times the amino acids interact, and 4 is how many times it is possible for these two amino acids (MF) to interact: M=2 from H1 and F=2 from H2, 2x2=4
LV 1-->5 ## L=1 from H1, V=3 from H2, 1x3=3
         ## L=2 from H2, V=1 from H1, 2x1=2; the sum is 5
H1 MMDYLITQV
H6 SVVQGLNFLGASLPIAFTGLFSGIAQGKVAAAGIQILAKK
D=1 I=1 L=1 M=2 Q=1 T=1 Y=1 V=1 The total aa num =9
A=7 F=3 G=6 I=4 K=3 L=5 N=1 P=1 Q=3 S=3 T=1 V=3 The total aa num =40
MV 2-->6 ## M=2 from H1, V=3 from H6, 2x3=6
LV 1-->8 ## L=1 from H1, V=3 from H6, 1x3=3; L=5 from H6, V=1 from H1, 5x1=5; 3+5=8 possible interactions
IV 1-->7
Any help please?

Replies are listed 'Best First'.
Re: confused with interactions...
by davido (Cardinal) on Jul 04, 2007 at 17:37 UTC

    We can explain the file parsing if you first explain the actual rule set. For example, I have no idea when amino acids interact and when there is only a possibility of interaction. Those of us who don't have a background in your subject would be better able to assist if we understood the basics of the puzzle.


    Dave

Re: confused with interactions...
by GrandFather (Saint) on Jul 04, 2007 at 21:10 UTC

    Some tools that will help:

    • Using split, as in: my @H1aAcids = split //, $H1; you get an array containing the individual letters (amino acids, I presume).
    • Using a hash (see perldata), you can count the number of occurrences of each letter: $H1count{$_}++ for @H1aAcids;
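
    Put together, those two tools might look like the sketch below. It is only an illustration: count_residues is a made-up helper name, and the sequence is the H1 string from the question.

```perl
use strict;
use warnings;

# Count how often each letter (amino acid) occurs in a sequence string.
# count_residues is a hypothetical helper name, not something from the thread.
sub count_residues {
    my ($seq) = @_;
    my %count;
    $count{$_}++ for split //, $seq;   # one hash slot per distinct letter
    return \%count;
}

my $h1     = 'MMDYLITQV';              # H1 from the question
my $counts = count_residues($h1);

# Print the tallies in alphabetical order, then the total length.
print join(' ', map { "$_=$counts->{$_}" } sort keys %$counts), "\n";
print "The total aa num =", length($h1), "\n";
# prints:
# D=1 I=1 L=1 M=2 Q=1 T=1 V=1 Y=1
# The total aa num =9
```

    The keys are sorted alphabetically here, so V and Y come out in a different order than in the sample output in the question.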

    If you want any more help, you're going to have to tell us, in terms of a recipe (algorithm), how the output relates to the input, or wait for one of the biologically knowledgeable monks to stroll by.
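
    For what it's worth, reading the ## comments in the sample output literally suggests one possible recipe. This is an assumption, not a confirmed rule. The first number for a pair XY is how many times the line X--Y is listed for that helix pair. The second is (count of X in the first helix) x (count of Y in the second) plus (count of X in the second) x (count of Y in the first). A rough sketch under that assumption follows; count_residues and possible_pairs are made-up names, and pairs of two identical letters are not covered by the example, so they get no special handling.

```perl
use strict;
use warnings;

# Tally each letter of a sequence (hypothetical helper).
sub count_residues {
    my ($seq) = @_;
    my %count;
    $count{$_}++ for split //, $seq;
    return \%count;
}

# ASSUMED rule, inferred from the worked numbers in the question:
#   possible(X,Y) = n_a(X) * n_b(Y) + n_b(X) * n_a(Y)
# Letters missing from a sequence count as zero.
sub possible_pairs {
    my ($seq_a, $seq_b, $x, $y) = @_;
    my $ca = count_residues($seq_a);
    my $cb = count_residues($seq_b);
    my $forward  = ($ca->{$x} || 0) * ($cb->{$y} || 0);
    my $backward = ($cb->{$x} || 0) * ($ca->{$y} || 0);
    return $forward + $backward;
}

# The H1-H6 record from the question.
my $h1    = 'MMDYLITQV';
my $h6    = 'SVVQGLNFLGASLPIAFTGLFSGIAQGKVAAAGIQILAKK';
my @lines = ('M--V', 'L--V', 'I--V', 'M--V');

# Count how often each pair is listed, remembering first-seen order.
my (%observed, @order);
for my $line (@lines) {
    my ($x, $y) = $line =~ /^([A-Z])--([A-Z])/ or next;
    push @order, "$x$y" unless $observed{"$x$y"}++;
}

for my $pair (@order) {
    my ($x, $y) = split //, $pair;
    printf "%s %d-->%d\n", $pair, $observed{$pair},
        possible_pairs($h1, $h6, $x, $y);
}
# prints:
# MV 2-->6
# LV 1-->8
# IV 1-->7
```

    Those three lines match the H1/H6 figures in the sample output, which is some evidence the reading is right, but the original poster would still need to confirm the rule.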


    DWIM is Perl's answer to Gödel