in reply to Data visualisation.

BrowserUk:

I think I have my code in fairly decent shape, functionality-wise. (It's still ugly right now, though.)

However, to get most of the points in your dataset located, I had to open up the tolerance to accept up to a 15% error. (It misses only 3 points at that setting.)

Do you have a way to check whether the coordinates look reasonable against your original data? I'm curious at how good the program works. I've still got some testing to do, but so far, it's been giving decent results with my test sets.

I've uglified the code a bit more to draw the coordinates in a bitmap so I could do a visual check with my original coordinate sets, and it comes out good on them (three different point sets).

Code in readmore tags...

#!/usr/bin/perl # # trv_slsmn_triangulate.pl <FName> # # Given the distance matrix, compute a set of 2D locations for the p +oints that yields # the specified distance map. # # Since any coordinates will do, we'll first set point A at (0,0), a +nd point B at (aDb,0). # Finally, we'll make sure the angle of point C is counterclockwise +from the line AB. # Obviously, we have to be sure that aDc+bDc > aDb, or it won't make + a triangle, so we # have to use a bit of care to select our first points. # use strict; use warnings; use Image::Magick; my $tolerance = 0.15; # 5% error my @Quadrants = ( [1, 1], [1, -1], [-1, 1], [-1, -1] ); # Prepare data source my $FName = shift; my $IFH; if (defined $FName) { open $IFH, '<', $FName; } else { $IFH = \*DATA; } # Read data my %D; my ($R, $C) = (0); my $max_dist = [ 0 ]; my @LOCs; while (<$IFH>) { my @t = split /\s+/, $_; $C=0; $LOCs[$R] = [ ]; while (@t) { my ($dist, $a, $b); $dist = shift @t; ($a,$b) = ($R>$C) ? ($C, $R) : ($R, $C); if ($a != $b) { if (exists $D{$a}{$b}) { die "Mismatch ($R,$C)=$dist, but ($C,$R)=$D{$a}{$b}\n" + if $dist != $D{$a}{$b}; } $D{$a}{$b} = $dist; if ($dist > $$max_dist[0]) { $max_dist = [ $dist, $a, $b ]; } } ++$C; } ++$R; } # OK, points A and B are the points in max_dist, set their initial loc +ations: my ($ptA, $ptB) = ($max_dist->[1], $max_dist->[2]); $LOCs[$ptA] = [0, 0]; $LOCs[$ptB] = [$max_dist->[0], 0]; print "Point A is $ptA (0, 0); point B is $ptB ($max_dist->[0], 0)\n"; my $dAB = dist($ptA, $ptB); # For point C, we'll find the point giving the triangle with the large +st area. # Presumably, that'll get us a point sufficiently far away from the AB + baseline # to give us the best ability to resolve the other points. $max_dist = [ 0 ]; for my $i (0 .. $#LOCs) { next if defined $LOCs[$i][0]; # skip A and B my ($dCA, $dCB) = (dist($i,$ptA), dist($i,$ptB) ); my $s = ($dCA+$dCB+$dAB)/2.0; my $dsq = sqrt( $s * ($s-$dCA) * ($s-$dCB) * ($s-$dAB) ); #print "$i: dCA=$dCA, dCB=$dCB, Area=$dsq"; if ($dsq > $max_dist->[0]) { $max_dist = [$dsq, $i]; #print " new best!"; } #print "\n"; } my $ptC = $max_dist->[1]; # Find position of c via law of cosines (CAB) and sin(arccos x)=sqrt(1 +-x**2) my ($dCA, $dCB) = (dist($ptC,$ptA), dist($ptC,$ptB)); print "dCA=$dCA, dAB=$dAB, dCB=$dCB\n"; my $cosCAB = ($dCA*$dCA + $dAB*$dAB - $dCB*$dCB) / (2*$dCA*$dAB); my $sinCAB = sqrt(1 - $cosCAB**2); $LOCs[$ptC] = [ $dCA * $cosCAB, $dCA * $sinCAB ]; print "Point C is $ptC ($LOCs[$ptC][0], $LOCs[$ptC][1])\n"; #dump_LOCs(); #dump_distance_map(); # remaining locations can be found by triangulation from the three kno +wn points for my $i (0 .. $#LOCs) { next if $i==$ptA or $i==$ptB or $i==$ptC; my ($dIA, $dIB, $dIC) = (dist($i,$ptA), dist($i,$ptB), dist($i,$pt +C)); if ($dIA+$dIB < $dAB) { print "$i: ($dIA, $dIB, $dIC) *** Fails triangle inequality ** +*\n"; next; } print "$i: ($dIA, $dIB, $dIC)\n"; my $cosIAB = ($dIA*$dIA + $dAB*$dAB - $dIB*$dIB) / (2*$dIA*$dAB); my $sinIAB = sqrt(1 - $cosIAB**2); my @best = (1000000); for my $q (@Quadrants) { my ($x_sign, $y_sign) = @$q; my ($x, $y) = ($dIA*$cosIAB*$x_sign, $dIA*$sinIAB*$y_sign); my $check = sqrt( ($LOCs[$ptC][0]-$x)**2 + ($LOCs[$ptC][1]-$y) +**2 ); my $err = abs($check-$dIC); my $err_pct = 100*$err / $dIC; if ($err < $dIC*$tolerance) { if ($err < $best[0]) { @best = ($err, $err_pct, $x, $y); } print "\t($x, $y) <case 1: $err, $err_pct>\n"; $LOCs[$i] = [$x, $y]; #last; } else { print "\t($x, $y) miss ($dIC, $err)\n"; } } $LOCs[$i] = [ $best[2], $best[3] ] if @best>1; } dump_LOCs(); my ($x_min, $x_max, $y_min, $y_max); ($x_min,$y_min) = ($x_max,$y_max) = @{$LOCs[0]}; for my $i (1 .. $#LOCs) { my ($x, $y) = @{$LOCs[$i]}; next if ! defined $y; $x_min = $x if $x_min>$x; $y_min = $y if $y_min>$y; $x_max = $x if $x_max<$x; $y_max = $y if $y_max<$y; } print "($x_min,$y_min) - ($x_max,$y_max)\n"; my ($x_offs, $y_offs, $x_scale, $y_scale); $x_scale = 1400 / ($x_max - $x_min); $y_scale = 900 / ($y_max - $y_min); $x_offs = 50 - $x_min; $y_offs = 50 - $y_min; if ($x_scale < $y_scale) { ($x_scale, $y_scale) = ($x_scale, $x_scale); } else { ($x_scale, $y_scale) = ($y_scale, $y_scale); } print "$x_scale . $y_scale; $x_offs, $y_offs\n"; for my $i (0 .. $#LOCs) { if (defined $LOCs[$i][0]) { my ($x,$y) = @{$LOCs[$i]}; my ($x1,$y1,$x2,$y2) = ( $x_scale*($x+$x_offs), 1000-$y_scale*($y+$y_offs) ); $LOCs[$i][2]=$x1; $LOCs[$i][3]=$y1; } } my $q = Image::Magick->new; $q->Set(size=>'1500x1000'); $q->Read('gradient:#000185-#0053f8'); my $pts="$LOCs[$ptA][2],$LOCs[$ptA][3] $LOCs[$ptB][2],$LOCs[$ptB][3]"; $q->Draw(stroke=>'white', fill=>'white', primitive=>'line', points=>$p +ts); $pts="$LOCs[$ptA][2],$LOCs[$ptA][3] $LOCs[$ptC][2],$LOCs[$ptC][3]"; $q->Draw(stroke=>'red', fill=>'red', primitive=>'line', points=>$pts); $pts="$LOCs[$ptB][2],$LOCs[$ptB][3] $LOCs[$ptC][2],$LOCs[$ptC][3]"; $q->Draw(stroke=>'green', fill=>'green', primitive=>'line', points=>$p +ts); for my $i (0 .. $#LOCs) { if (defined $LOCs[$i][0]) { my (undef, undef, $x,$y) = @{$LOCs[$i]}; my ($x1,$y1,$x2,$y2) = ($x-5, $y-5, $x+5, $y+5); my $fill = 'black'; $fill = 'red' if $i eq $ptA; $fill = 'green' if $i eq $ptB; $fill = 'yellow' if $i eq $ptC; my $pts = "$x1,$y1 $x2,$y2"; print "$i: $pts\n"; $q->Draw(stroke=>'white', fill=>$fill, primitive=>'rectangle', + points=>$pts); } } $q->Write('map.gif'); sub dist { my ($r, $c) = @_; return 0 if $r eq $c; if (defined($LOCs[$r][0]) and defined($LOCs[$c][0]) ) { return sqrt( ($LOCs[$r][0]-$LOCs[$c][0])**2 + ($LOCs[$r][1]-$L +OCs[$c][1])**2 ); } ($r, $c) = ($c, $r) if $c < $r; die "$r,$c entry DNE?\n" unless exists $D{$r}{$c}; $D{$r}{$c}; } sub dump_LOCs { for my $i (0 .. $#LOCs) { print dump_helper($i), ": "; if (defined($LOCs[$i][0])) { print "($LOCs[$i][0], $LOCs[$i][1])\n"; } else { print "undef\n"; } } } sub dump_helper { my $i = shift; return "<A>" if $i eq $ptA; return "<B>" if $i eq $ptB; return "<C>" if $i eq $ptC; return "<$i>"; } sub dump_distance_map { print " "; printf "%6s ", dump_helper($_) for 0 .. $#LOCs; print "\n"; for my $i (0 .. $#LOCs) { printf "%-4.4s", dump_helper($i); for my $j (0 .. $#LOCs) { my $d = dist($i,$j); printf "% 6u ", $d; } print "\n"; } } # Matrix shows distance from pt on left to dest column __DATA__ 0 3812 13902 8619 15811 5015 5230 9615 10624 13346 7575 6170 6 +812 9487 18135 8030 5409 17959 12822 17136 3267 12882 11223 11078 3812 0 11527 12431 15446 8057 4519 8761 14398 12569 3764 6668 + 10603 11117 14805 5154 8276 18175 9367 14840 7056 9698 13603 7 +266 13902 11527 0 13638 10220 18405 8675 4611 12993 11970 8798 822 +6 15087 16591 6859 6381 11223 7602 3236 3457 14535 8830 14748 +6655 8619 12431 13638 0 9965 5555 11256 11549 2157 10917 16194 9609 + 1926 7111 12565 14906 5737 9378 16868 11899 5402 15921 5765 19 +675 15811 15446 10220 9965 0 11122 18683 14599 7873 2940 13119 1779 +3 11057 6374 3517 14973 15565 3627 10307 7077 14478 6475 5155 1 +0235 5015 8057 18405 5555 11122 0 10109 13856 6744 9617 11348 10270 + 3811 4858 14594 12975 7186 13212 17301 17106 3881 12658 6210 14 +138 5230 4519 8675 11256 18683 10109 0 4584 13284 16923 6018 2299 + 10267 14709 15205 3741 5549 15863 8377 11981 6865 12543 16177 8 +600 9615 8761 4611 11549 14599 13856 4584 0 12503 16548 8240 3619 + 11888 18524 11444 4627 6948 11282 6102 7523 9992 12321 16782 8 +550 10624 14398 12993 2157 7873 6744 13284 12503 0 9231 18070 1140 +5 3813 6447 10427 16699 7735 7340 15982 10421 7492 14151 4374 1 +8092 13346 12569 11970 10917 2940 9617 16923 16548 9231 0 11052 1922 +0 11199 4874 5269 14093 16358 6543 10709 9458 13488 5008 5170 +9146 7575 3764 8798 16194 13119 11348 6018 8240 18070 11052 0 8220 + 14354 12490 11186 3614 11306 14452 5954 11533 10801 6649 14987 3 +508 6170 6668 8226 9609 17793 10270 2299 3619 11405 19220 8220 0 + 9124 15115 15059 5330 3975 14165 9100 10994 6469 14453 15268 10 +326 6812 10603 15087 1926 11057 3811 10267 11888 3813 11199 14354 9124 + 0 6645 14126 13999 5168 11153 17968 13808 3728 15782 6168 17 +749 9487 11117 16591 7111 6374 4858 14709 18524 6447 4874 12490 15115 + 6645 0 9754 15942 11588 9288 15400 13331 8652 9157 2621 12 +782 18135 14805 6859 12565 3517 14594 15205 11444 10427 5269 11186 1505 +9 14126 9754 0 11725 16484 3379 6883 4250 17835 5311 8643 +7729 8030 5154 6381 14906 14973 12975 3741 4627 16699 14093 3614 5330 + 13999 15942 11725 0 9173 13841 4874 9780 10451 9122 18555 5 +003 5409 8276 11223 5737 15565 7186 5549 6948 7735 16358 11306 3975 + 5168 11588 16484 9173 0 13413 12955 12642 3450 17931 11293 14 +142 17959 18175 7602 9378 3627 13212 15863 11282 7340 6543 14452 1416 +5 11153 9288 3379 13841 13413 0 9181 4145 14775 8581 7147 1 +0945 12822 9367 3236 16868 10307 17301 8377 6102 15982 10709 5954 910 +0 17968 15400 6883 4874 12955 9181 0 5593 15242 6278 15447 +3419 17136 14840 3457 11899 7077 17106 11981 7523 10421 9458 11533 1099 +4 13808 13331 4250 9780 12642 4145 5593 0 15914 8478 11291 +8453 3267 7056 14535 5402 14478 3881 6865 9992 7492 13488 10801 6469 + 3728 8652 17835 10451 3450 14775 15242 15914 0 15624 9326 14 +308 12882 9698 8830 15921 6475 12658 12543 12321 14151 5008 6649 1445 +3 15782 9157 5311 9122 17931 8581 6278 8478 15624 0 10157 +4139 11223 13603 14748 5765 5155 6210 16177 16782 4374 5170 14987 1526 +8 6168 2621 8643 18555 11293 7147 15447 11291 9326 10157 0 1 +4265 11078 7266 6655 19675 10235 14138 8600 8550 18092 9146 3508 1032 +6 17749 12782 7729 5003 14142 10945 3419 8453 14308 4139 14265 + 0

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Replies are listed 'Best First'.
Re^2: Data visualisation.
by BrowserUk (Patriarch) on Jan 04, 2014 at 08:44 UTC
    Do you have a way to check whether the coordinates look reasonable against your original data? I'm curious at how good the program works.

    Sorry for the delay in getting back to you Roboticus, but I was caught up with my own efforts. I tried C&ping my dataset into your code but something broke and I couldn't immediately see what.

    I finally just got back to it and it was simply that you use split /\s+/ rather than split ' ' and my data had leading whitespace which screwed everything up. Fixed that, ran it and got:

    Those undefs are the points you had trouble with I assume. Everyone has found problems with the dataset -- it comes from TSPLIB and there was never any guarantee that it was a plottable dataset; that's part of the reason for whating to try and visualise it.

    However, everyone seems to have problems with different points.

    I eventually gave up with the Law of Cosines method. Not only are there four quadrants that each point could be in, there are two points (+-y) that need to be considered. Instead I went the Circle-Circle intersection route, which proved to be far simpler.

    This animated gif shows the problem with the dataset quite nicely. The green arcs are the distances from B & P. The cyan arc is from the 3rd point (in this case the A point), which is used to decide (nearest wins) which of the two intersect points of the green arcs is chosen. It also highlights the accuracy or lack thereof of the correspondence between them.

    It shows that -- with A as the third point -- D is a particularly bad fit; but chose a different 3rd reference point and D might be spot on but some other previously good fit point becomes bad.

    Anyway, many thanks for your code -- you're first attempt was the cluebat I needed to get started. I've added my code below, but it is also very obfuscated with the image generation code. I need to separate the two, but it was very useful when coding.

    My code:


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.