Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, I have a school assignment in Biology but I seem to be unable to begin... What I was given is something like the following :
HELIX1:
          x        y        z
A 1   -1.115    8.537    7.075
B 2   -2.745    5.280    7.165
C 3   -0.777    3.267    7.329
D 4    1.610    5.225   10.885
E 5    0.296    5.263   10.912
HELIX2:
K 1   -0.696   13.041   22.357
L 2    1.152   11.081   23.082
M 3    2.200   17.590   18.424
What I must do is :
take each atom from each helix (for example, atoms A, B, C, D, E from helix1 and atoms K, L, M from helix2) and calculate its distance to every atom of the remaining helices.
An example :
1) Take atom A from helix1
2) Calculate distance with atom K from helix2
3) Calculate distance with atom L from helix2
4) Calculate distance with atom M from helix2
To calculate the distance between, for example, A (helix1) and K (helix2), I use the formula:
distance = (Xa-Xk)^2 + (Ya-Yk)^2 + (Za-Zk)^2
So, what I need to store is :
x,y,z for each atom
then, using the formula above, I will calculate the distance...
I don't know if I need a hash of arrays, an array of hashes or something more or less complicated... Any help?

Re: confused with distances
by Roy Johnson (Monsignor) on Mar 23, 2006 at 18:35 UTC
    Probably just an array of arrays for each helix. Make it an AoAoA if you want to have an array of helixes.
    my $helix_1 = [[-1.115, 8.537, 7.075], [-2.745, 5.280, 7.165], ...
    The first atom is $helix_1[0] and the z component (for example) of that would be $helix_1[0][2].

    Update: As johngg notes, I need to make things agree. I'd probably do

    my @helix_1 = ([-1.115....

    Caution: Contents may have been coded under pressure.
      I think you probably need to dereference $helix_1, as you have created a reference to an array, not an array.

      The first atom should be $helix_1->[0] and the z component $helix_1->[0]->[2].

      There are other notations to do de-referencing but this is the style I find most readable.
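      To make the difference concrete, here is a minimal sketch, assuming only the first two atoms of HELIX1 from the question. The variable names $helix_ref, $z_from_array, $z_from_ref and $z_short are made up for illustration.

```perl
use strict;
use warnings;

# Assumed sample data: the first two atoms of HELIX1 from the question.
my @helix_1   = ([-1.115, 8.537, 7.075], [-2.745, 5.280, 7.165]);  # a real array
my $helix_ref = [[-1.115, 8.537, 7.075], [-2.745, 5.280, 7.165]];  # a reference to an array

# Real array: subscript it directly.
my $z_from_array = $helix_1[0][2];

# Reference: dereference with the arrow first.
my $z_from_ref = $helix_ref->[0]->[2];

# The arrow between two adjacent subscripts is optional, so this is equivalent.
my $z_short = $helix_ref->[0][2];

print "$z_from_array $z_from_ref $z_short\n";
```

      All three expressions pick out the z component of the first atom; which spelling you use is mostly a matter of taste.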

      Cheers,

      JohnGG

Re: confused with distances
by duff (Parson) on Mar 23, 2006 at 18:40 UTC

    You can get as complicated as you want, but if you just need to calculate distances from atom to atom, then it don't really matter what helix they belong to, does it? So maybe

    %atoms = (
        A => [ -1.115,  8.537,  7.075 ],
        B => [ -2.745,  5.280,  7.165 ],
        C => [ -0.777,  3.267,  7.329 ],
        D => [  1.610,  5.225, 10.885 ],
        E => [  0.296,  5.263, 10.912 ],
        K => [ -0.696, 13.041, 22.357 ],
        L => [  1.152, 11.081, 23.082 ],
        M => [  2.200, 17.590, 18.424 ],
    );
    is fine. But then I don't know what other problems you hope to solve. Structure your data in terms of the problems you need to solve, ignoring superfluous information.

    Update: As an example of making things more complicated, if you need to retain all of the information, maybe your data structure would look like this (hashes all the way down):

    %helixes = (
        helix1 => {
            A => { x => -1.115, y => 8.537, z => 7.075 },
            B => { x => -2.745, y => 5.280, z => 7.165 },
            ...
        },
        helix2 => {
            K => { x => -0.696, y => 13.041, z => 22.357 },
            L => { x =>  1.152, y => 11.081, z => 23.082 },
            M => { x =>  2.200, y => 17.590, z => 18.424 },
        },
    );
      I suggest the arrayref with constants X, Y, Z:
      use constant { X=>0, Y=>1, Z=>2};
      Then you can access a coordinate (here the x coordinate of atom A) with:
      $helix1{A}[X];
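      A minimal sketch of that idea, assuming a %helix1 hash of array references holding two atoms from the question; the variables $ax and $bz are illustrative only.

```perl
use strict;
use warnings;
use constant { X => 0, Y => 1, Z => 2 };

# Assumed sample data: two atoms of HELIX1 from the question.
my %helix1 = (
    A => [ -1.115, 8.537, 7.075 ],
    B => [ -2.745, 5.280, 7.165 ],
);

# Inside an array subscript the constant is evaluated, so [X] means [0].
my $ax = $helix1{A}[X];
my $bz = $helix1{B}[Z];

print "A.x = $ax, B.z = $bz\n";
```

      One thing to watch: this works in array subscripts only. In a hash subscript such as $h{X}, Perl treats the bareword as the string 'X', not as the constant.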
Re: confused with distances
by Limbic~Region (Chancellor) on Mar 23, 2006 at 18:57 UTC
    Anonymous Monk,
    I think your distance formula is wrong. I believe you are supposed to take the square root of the sum of the deltas squared. In any case, here is working code.

    It uses an AoHoAoH. Since you haven't indicated whether this is a programming assignment within your biology course, I haven't commented the code. It is more complicated than it needs to be, but that is so that it can easily be adapted to meet unstated requirements. Please ask questions if you don't understand.

    Cheers - L~R

Re: confused with distances
by swampyankee (Parson) on Mar 23, 2006 at 19:13 UTC

    If there are going to be many helices, I'd use a hash of hashes, where each element is an array reference:

    $atoms{HELIX1}{A} = [ ( -1.115, 8.537, 7.075)];

    While the above code is doubtless less elegant than, and otherwise inferior to, what monks more fluent in Perl than I would write, it will process a small amount of data (say, a few thousand points). Now, why would your assignment be to calculate the square of the distance: R^2 = ∑(x_i − x_j)^2?

    Do you really mean: R = √(∑(x_i − x_j)^2)?
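
    If the square root is what is wanted, the calculation could be sketched like this. The subroutine name distance and the variables $atom_a and $atom_k are hypothetical, and each atom is assumed to be a reference to an [x, y, z] array.

```perl
use strict;
use warnings;

# Hypothetical helper: Euclidean distance between two atoms,
# each given as a reference to an array [x, y, z].
sub distance {
    my ($p, $q) = @_;
    my $sum = 0;
    $sum += ($p->[$_] - $q->[$_]) ** 2 for 0 .. 2;   # sum of squared deltas
    return sqrt $sum;                                # drop the sqrt to get R^2
}

# Atoms A (HELIX1) and K (HELIX2) from the question.
my $atom_a = [ -1.115,  8.537,  7.075 ];
my $atom_k = [ -0.696, 13.041, 22.357 ];

printf "%.3f\n", distance($atom_a, $atom_k);
```

    If the assignment really does want the squared distance, returning $sum without the sqrt gives exactly the formula in the question.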

    emc

    " The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents."
    —Nathaniel S. Borenstein
Re: confused with distances
by ikegami (Patriarch) on Mar 23, 2006 at 18:51 UTC

    Each atom has 3 positions.
    You could use an associative array (i.e. hash) of positions (e.g. (x => $x, y => $y, z => $z)), but that's probably overkill.
    Each atom can easily be represented by an array of positions (e.g. ($x, $y, $z)).

    Each helix is made of atoms.
    If you wish to name your atoms, a helix can easily be represented by an associative array (i.e. hash) of atoms (e.g. (K => $atom0, L => $atom1, M => $atom2)).
    If numerical indexes are sufficient identification for atoms, each helix can easily be represented by an array of atoms (e.g. ($atom0, $atom1, $atom2)).
    Let's go with names, since you already have names for them.

    my %helix1 = (
        A => [ -1.115, 8.537,  7.075 ],
        B => [ -2.745, 5.280,  7.165 ],
        C => [ -0.777, 3.267,  7.329 ],
        D => [  1.610, 5.225, 10.885 ],
        E => [  0.296, 5.263, 10.912 ],
    );
    my %helix2 = (
        K => [ -0.696, 13.041, 22.357 ],
        L => [  1.152, 11.081, 23.082 ],
        M => [  2.200, 17.590, 18.424 ],
    );

    Each distance has two indexes (the 'from' atom in the first helix and the 'to' atom in the second helix), so you'll need a 2D structure. We (I) decided above to use named atoms, so your distance map will be a 2D hash (e.g. $distance{A}{K} = distance($helix1{A}, $helix2{K});).

    To populate %distance, think nested foreach loops. Good luck!
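
    For reference, the nested loops might look something like the sketch below. It assumes the two hashes of array references shown earlier and a hypothetical distance() helper that takes the square root of the summed squared deltas, as other replies suggest; neither the helper nor the output format comes from the original question.

```perl
use strict;
use warnings;

my %helix1 = (
    A => [ -1.115, 8.537,  7.075 ],
    B => [ -2.745, 5.280,  7.165 ],
    C => [ -0.777, 3.267,  7.329 ],
    D => [  1.610, 5.225, 10.885 ],
    E => [  0.296, 5.263, 10.912 ],
);
my %helix2 = (
    K => [ -0.696, 13.041, 22.357 ],
    L => [  1.152, 11.081, 23.082 ],
    M => [  2.200, 17.590, 18.424 ],
);

# Hypothetical helper: Euclidean distance between two [x, y, z] array refs.
sub distance {
    my ($p, $q) = @_;
    my $sum = 0;
    $sum += ($p->[$_] - $q->[$_]) ** 2 for 0 .. 2;
    return sqrt $sum;
}

# Outer loop over the 'from' atoms, inner loop over the 'to' atoms.
my %distance;
for my $from (sort keys %helix1) {
    for my $to (sort keys %helix2) {
        $distance{$from}{$to} = distance($helix1{$from}, $helix2{$to});
    }
}

# Print the 2D hash, one pair per line.
for my $from (sort keys %distance) {
    for my $to (sort keys %{ $distance{$from} }) {
        printf "%s-%s: %.3f\n", $from, $to, $distance{$from}{$to};
    }
}
```

    With more than two helices, one more level of looping over pairs of helices would be needed; that is left out here to keep the sketch short.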