jcklasseter has asked for the wisdom of the Perl Monks concerning the following question:

So, I'm trying to compute the distance between 2 points in an XYZ format. I'm not quite sure if I'm calling the distance wrong, or if my formula is messed up, or all of the above and anything else. My input file looks like this:
1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
So far, this is what I have.
#!/usr/bin/perl use strict; use warnings; use diagnostics; my $source = "./CONTCAR"; my $destination = "./OUTPUT"; open(IN, '<', $source) or die "Couldn't open $source: $!\n"; open(OUT, '>', $destination) or die "Couldn't write to $destinatio +n: $!\n"; my @data = map [ split ], grep /\S/, <IN>; foreach (@data) { print "$_\n"; } for my $i (0 .. $#data) { for my $j (0 .. $#data) { next if $i == $j; my @coords_i = @{$data[$i]}[0,1,2]; my @coords_j = @{$data[$j]}[0,1,2]; printf OUT "%s to %s Distance=%.5f\n", $data[$i][0], $data[$j][0], $data[$i][2], $data[$j][2], distance(\@coords_i, \@coords_j); } } sub distance { my ($aa, $bb) = @_; my ($x, $y, $z) = map { $aa->[$_] - $bb->[$_] } 0 .. $#$aa; return sqrt(($x - $x)**2 + ($y - $y)**2 + ($z - $z)**2); } close IN; close OUT; print "Done.\n";
In the output file, this :
1 to 2 Distance=1.00000 1 to 3 Distance=1.00000 1 to 4 Distance=1.00000 1 to 5 Distance=1.00000 2 to 1 Distance=2.00000 2 to 3 Distance=2.00000 2 to 4 Distance=2.00000 2 to 5 Distance=2.00000 3 to 1 Distance=3.00000 3 to 2 Distance=3.00000 3 to 4 Distance=3.00000 3 to 5 Distance=3.00000 4 to 1 Distance=4.00000 4 to 2 Distance=4.00000 4 to 3 Distance=4.00000 4 to 5 Distance=4.00000 5 to 1 Distance=5.00000 5 to 2 Distance=5.00000 5 to 3 Distance=5.00000 5 to 4 Distance=5.00000
is printed. Any help is appreciated.

Replies are listed 'Best First'.
Re: Subroutine not correct (I think)
by Corion (Patriarch) on Jun 17, 2015 at 12:59 UTC
    printf OUT "%s to %s Distance=%.5f\n", $data[$i][0], $data[$j][0], $data[$i][2], $data[$j][2], distance(\@coords_i, \@coords_j);

    Your printf statement has three placeholders, but you supply five elements of data to be printed.

    return sqrt(($x - $x)**2 + ($y - $y)**2 + ($z - $z)**2);

    The expression $x - $x will always be zero, as will the others.

      Aren't the printf placeholders for the 3 columns of data? I'm not sure how to fix the formula. I don't know how to iterate very well, at all. Any suggestions?

        There are five items of data in the list passed to sprintf, not three. They are:

        1. $data[$i][0]
        2. $data[$j][0]
        3. $data[$i][2]
        4. $data[$j][2]
        5. distance(\@coords_i, \@coords_j)

        Therefore, your format string in the sprintf call should have enough placeholders (five!) to match this list and not merely the three currently present.

Re: Subroutine not correct (I think)
by jdporter (Paladin) on Jun 17, 2015 at 14:09 UTC
    use List::Util qw( sum ); use List::MoreUtils qw( pairwise ); sub distance { my( $aa, $bb ) = @_; $aa == $bb and return 0; # optimiz. @$aa == @$bb or die "coordinate vectors unequal length!"; sqrt sum pairwise { ( $a - $b ) ** 2 } @$aa, @$bb }
Re: Subroutine not correct (I think)
by pme (Monsignor) on Jun 17, 2015 at 13:08 UTC
    Hi jcklasseter,

    I am not really understand what would you like to do with your input data. But I can point out two things.

    printf has three format specifier and five values.

    printf OUT "%s to %s Distance=%.5f\n", $data[$i][0], $data[$j][0], $data[$i][2], $data[$j][2], distance(\@coords_i, \@coords_j);

    ($x - $x), ($y - $y) and ($z - $z) are always equal to zero.

    return sqrt(($x - $x)**2 + ($y - $y)**2 + ($z - $z)**2);
      The coming up to zero part makes a lot of sense. Do you have any idea how I would start fixing that?
        This piece of code may do something similar what you need.
        use strict; use warnings; use diagnostics; sub distance { my ($aa, $bb) = @_; return sqrt(($aa->[0] - $bb->[0])**2 + ($aa->[1] - $bb->[1])**2 + ($aa->[2] - $bb->[2])**2); } my $source = "./CONTCAR"; open(IN, '<', $source) or die "Couldn't open $source: $!\n"; my @data = map [ split ], grep /\S/, <IN>; foreach my $d1 (@data) { foreach my $d2 (@data) { printf "($d1->[0],$d1->[1],$d1->[2]) <-> ($d2->[0],$d2->[1],$d +2->[2]) = %f\n", distance($d1, $d2); } } close IN; print "Done.\n";

        jcklasseter:

        Yeah, just get rid of the -$* part. Normally, you'd compute the distance like:

        return sqrt( ($x2 - $x1)**2 + ($y2 - $y1)**2 + ($z2 - $z1)**2 );

        But another way to think of it would be:

        my $dx = $x2 - $x1; my $dy = $y2 - $y1; my $dz = $z2 - $z1; return sqrt( $dx**2 + $dy**2 + $dz**2 );

        Your map statement is computing dx, dy and dz (you've just labelled them as x, y and z), so after your map statement, you can just use:

        return sqrt( $x**2 + $y**2 + $z**2 );

        Though I'd suggest renaming the variables x, y and z to clarify the code.

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

Re: Subroutine not correct (I think)
by QM (Parson) on Jun 17, 2015 at 13:25 UTC
    In line 34 you have already computed the differences. Just square the values:
    return sqrt($x**2 + $y**2 + $z**2);

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

      Unfortunately, that gave me the same output as before.

        What steps have you taken to inspect the return value of your distance() function?

        Also consider re-reading sprintf and printf and the replies you already got. Many of them addressed two problems in your program, but changing the formula is only one problem.

Re: Subroutine not correct (I think)
by QM (Parson) on Jun 17, 2015 at 14:00 UTC
    Update: Fixed loop variables properly.

    OK, fixing it up. I've commented out the input/output file stuff, and just used <DATA> at the end for convenience:

    #!/usr/bin/perl use strict; use warnings; use diagnostics; #my $source = "./CONTCAR"; #my $destination = "./OUTPUT"; # #open(IN, '<', $source) or die "Couldn't open $source: $!\n"; #open(OUT, '>', $destination) or die "Couldn't write to $destination: +$!\n"; my @data = map [ split ], grep /\S/, <DATA>; foreach (@data) { print "@$_\n"; } for my $i (0 .. $#data-1) { for my $j ($i+1 .. $#data) { # next if $i == $j; my @coords_i = @{$data[$i]}[0,1,2]; my @coords_j = @{$data[$j]}[0,1,2]; # printf OUT "%s to %s Distance=%.5f\n", printf "(%s,%s) to (%s,%s) Distance=%.5f\n", $data[$i][0], $data[$j][0], $data[$i][2], $data[$j][2], distance(\@coords_i, \@coords_j); } } sub distance { my ($aa, $bb) = @_; my ($x, $y, $z) = map { $aa->[$_] - $bb->[$_] } 0 .. $#$aa; return sqrt($x**2 + $y**2 + $z**2); } # close IN; # close OUT; # print "Done.\n"; exit; __DATA__ 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5

    Which outputs:

    1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 (1,2) to (1,2) Distance=1.73205 (1,3) to (1,3) Distance=3.46410 (1,4) to (1,4) Distance=5.19615 (1,5) to (1,5) Distance=6.92820 (2,3) to (2,3) Distance=1.73205 (2,4) to (2,4) Distance=3.46410 (2,5) to (2,5) Distance=5.19615 (3,4) to (3,4) Distance=1.73205 (3,5) to (3,5) Distance=3.46410 (4,5) to (4,5) Distance=1.73205

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re: Subroutine not correct (I think)
by BillKSmith (Monsignor) on Jun 17, 2015 at 14:53 UTC
    You are computing every distance twice. It probably is better to keep your results in an array and take advantage of symmetry.
    use strict; use warnings; use Data::Dumper; my @data; while (<DATA>) { next if !/\S/; push @data, [split]; } my @array; for my $i (0..$#data) { $array[$i][$i] = 0; for my $j ($i+1..$#data) { $array[$j][$i] = $array[$i][$j] = distance( $data[$i], $data[$ +j] ); } } print Dumper( \@array ); sub distance { my ($p1, $p2) = @_; return sqrt ( ($p1->[0] - $p2->[0]) ** 2 + ($p1->[1] - $p2->[1]) ** 2 + ($p1->[2] - $p2->[2]) ** 2 ); } __DATA__ 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
    Bill
Re: Subroutine not correct (I think)
by jcklasseter (Sexton) on Jun 17, 2015 at 12:48 UTC
    There was another post by me that was similar, but I completely messed it up and destroyed the conversation/context, so it was recommended to me to start a new question.