I have a set of X,Y coordinates on a 2D map. I'm trying to produce a set numbers (average, mean, stddev) of all edge lengths between neighbooring coordinates. An edge is defined as a line segemnt between two points. A neighboor is defined as an edge where no shorter edge that is defining a neighboor relationship from any possible combination of coordinates intersects it. [I think that's clear ;P]

The issue here is speed. This algorithm is O(N^3), and it doesn't take many coordinates to reach into the millions of caculations needed. For my example here, 301 coordinates, I'm finding I need 28882029 line segment intersection caculations. blech!

First I build all the unique edges. Then I loop through the edges sorted by length. Inside that I loop through all previously defined neighboor edges to determine if I have any intersection. If I don't intersect I push that edge onto the list of neigboors.

Here's a code example and a link to sample data:

example YAML dumped data: http://keep.drewhead.org/dump_301.dat

#!/usr/bin/perl ###################################################################### # CPAN Modules ###################################################################### use YAML qw(Load); use Data::Dumper; use Carp; use Benchmark; use strict; ###################################################################### # A few globals ###################################################################### my $points; # hash ref holding the raw data {'1' => [X,Y], '2' => [X,Y +], ...} my $edges; # Array ref to keep refrences to all edges [edge, X1, Y1, X +2, Y2] ... my $neighborEdges; # hash ref to hold sorted neighbor edges {'1' => ed +ge1, '2' => edge2, ...} my $precision = 7; my $delta = 10 ** (-$precision); ###################################################################### # # MAIN # ###################################################################### # Read in the X,Y coordinates from supplied data { local $/; open(my $fh, "dump_301.dat") or die "Unable to open file: $!\n"; $points = Load( <$fh> ); croak($@) if $@; close($fh); } my $start_pairs = Benchmark->new; my $edges = find_unique_edges($points); print "Unique edges ", scalar(@{$edges}), "\n"; print "Pairs parsed in :", timestr(timediff(Benchmark->new, $start_pai +rs)), "\n"; my $start_intersects = Benchmark->new; my $neighborEdges = find_unintersected_edges($edges); print "Unintersected edges ", scalar(keys %{$neighborEdges}), "\n"; print "Intersects parsed in :", timestr(timediff(Benchmark->new, $star +t_intersects)), "\n"; ###################################################################### ###################################################################### sub find_unique_edges { my $MAP = shift; my @edges; # returnable data structure my $start_distance = Benchmark->new; my $found = {}; # p1 is starting or anchor point of the line segment foreach my $p1 (@{$points}) { # p2 is end point of the line segment foreach my $p2 (@{$points}) { # We don't need to caculate if anchor and end are the same # or we have already seen these two pairs [reversed] before unless ( ($p1 == $p2) || $found->{$p2}->{$p1} ) { # Compute the edge my $edge = sqrt( ($p1->[0] - $p2->[0])**2 + ($p1->[1] - $p2->[1])**2 ); # Push the whole thing on a stack push (@edges, [ $edge, $p1->[0], $p1->[1], $p2->[0], $p2->[1] +]); # Keep track of the edges we've already computed (no need to d +o so twice) $found->{$p2}->{$p1} = 1; } } } return(\@edges); } ###################################################################### ###################################################################### sub find_unintersected_edges { my $PAIRS = shift; my $neighborEdges; # returnable data structure my $neighboorCNT = 1; foreach my $aref ( sort {$a->[0] <=> $b->[0] } @{$PAIRS}) { my $neighbor = 1; my $line_1_ref = []; push (@{$line_1_ref}, [$aref->[1],$aref->[2]]); # line 1 point A X +,Y push (@{$line_1_ref}, [$aref->[3],$aref->[4]]); # line 1 point B X +,Y for (my $n=1;$n<=$neighboorCNT;$n++) { if ($neighborEdges->{$n}) { my $points_ref; push (@{$points_ref}, @{$line_1_ref}); # line 1 points push (@{$points_ref}, [$neighborEdges->{$n}->[1],$neighborEdge +s->{$n}->[2]]); # line 2 point A X,Y push (@{$points_ref}, [$neighborEdges->{$n}->[3],$neighborEdge +s->{$n}->[4]]); # line 2 point B X,Y # If a intersect is found, set false and quit checking if (SegmentIntersection($points_ref)) { $neighbor = 0; last; } } } if ($neighbor) { $neighborEdges->{$neighboorCNT} = $aref; $neighboorCNT++; } } return($neighborEdges); } ###################################################################### +########## # # Shamlessly stolen from Math::Geometry::Planar sub SegmentIntersection { my $pointsref = $_[0]; my @points = @$pointsref; my @p1 = @{$points[0]}; # p1,p2 = segment 1 my @p2 = @{$points[1]}; my @p3 = @{$points[2]}; # p3,p4 = segment 2 my @p4 = @{$points[3]}; my $n1 = Determinant(($p3[0]-$p1[0]),($p3[0]-$p4[0]),($p3[1]-$p1[1]) +,($p3[1]-$p4[1])); my $n2 = Determinant(($p2[0]-$p1[0]),($p3[0]-$p1[0]),($p2[1]-$p1[1]) +,($p3[1]-$p1[1])); my $d = Determinant(($p2[0]-$p1[0]),($p3[0]-$p4[0]),($p2[1]-$p1[1]) +,($p3[1]-$p4[1])); if (abs($d) < $delta) { return 0; # parallel } if (!(($n1/$d < 1) && ($n2/$d < 1) && ($n1/$d > 0) && ($n2/$d > 0))) { return 0; } return(1); } ###################################################################### +########## # # The determinant for the matrix | x1 y1 | # | x2 y2 | # # args : x1,y1,x2,y2 # sub Determinant { my ($x1,$y1,$x2,$y2) = @_; return ($x1*$y2 - $x2*$y1); }

The correct answer is Unintersected edges 1778.

In this code I'm only trying to find the unique edges. Once I have that the statistics I need to caculate are trivial.
Also I took the code from Math::Geometry::Planar and tweaked the SegmentIntersection routine to return a boolean. As written it returned the intersection point which was more than I actually needed.

Devel::Profile tell me this:

%Time Sec. #calls sec/call F name 58.21 1889.8621 28882029 0.000065 main::SegmentIntersection 30.60 993.5924 1 993.592408 main::find_unintersected_edge +s 11.12 361.0416 86646087 0.000004 main::Determinant 0.05 1.7613 1 1.761285 main::find_unique_edges
So I'm spending a ton of time doing the math and looping through the edge lists. I'm looking for suggestions of things to speed up this algorithm, though I'm willing to entertain ideas of differing algorithms. :)

In reply to Millions of line segment intersection calcs: Looking for speed tips by drewhead

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.