comment on

I have a set of X,Y coordinates on a 2D map. I'm trying to produce a set numbers (average, mean, stddev) of all edge lengths between neighbooring coordinates. An edge is defined as a line segemnt between two points. A neighboor is defined as an edge where no shorter edge that is defining a neighboor relationship from any possible combination of coordinates intersects it. [I think that's clear ;P]

The issue here is speed. This algorithm is O(N^3), and it doesn't take many coordinates to reach into the millions of caculations needed. For my example here, 301 coordinates, I'm finding I need 28882029 line segment intersection caculations. blech!

First I build all the unique edges. Then I loop through the edges sorted by length. Inside that I loop through all previously defined neighboor edges to determine if I have any intersection. If I don't intersect I push that edge onto the list of neigboors.

Here's a code example and a link to sample data:

example YAML dumped data: http://keep.drewhead.org/dump_301.dat

#!/usr/bin/perl

######################################################################
# CPAN Modules
######################################################################

use YAML qw(Load);
use Data::Dumper;
use Carp;
use Benchmark;

use strict;

######################################################################
# A few globals
######################################################################

my $points; # hash ref holding the raw data {'1' => [X,Y], '2' => [X,Y
+], ...}
my $edges; # Array ref to keep refrences to all edges [edge, X1, Y1, X
+2, Y2] ...
my $neighborEdges; # hash ref to hold sorted neighbor edges {'1' => ed
+ge1, '2' => edge2, ...}

my $precision = 7;
my $delta = 10 ** (-$precision);

######################################################################
#
#       MAIN
#
######################################################################

# Read in the X,Y coordinates from supplied data
{
  local $/;
  open(my $fh, "dump_301.dat") or die "Unable to open file: $!\n";
  $points = Load( <$fh> );
  croak($@) if $@;
  close($fh);
}

my $start_pairs = Benchmark->new;
my $edges = find_unique_edges($points);
print "Unique edges ", scalar(@{$edges}), "\n";
print "Pairs parsed in :", timestr(timediff(Benchmark->new, $start_pai
+rs)), "\n";

my $start_intersects = Benchmark->new;
my $neighborEdges = find_unintersected_edges($edges);
print "Unintersected edges ", scalar(keys %{$neighborEdges}), "\n";
print "Intersects parsed in :", timestr(timediff(Benchmark->new, $star
+t_intersects)), "\n";

######################################################################
######################################################################

sub find_unique_edges
{

  my $MAP = shift;

  my @edges; # returnable data structure

  my $start_distance = Benchmark->new;
  my $found = {};

  # p1 is starting or anchor point of the line segment
  foreach my $p1 (@{$points}) {

    # p2 is end point of the line segment
    foreach my $p2 (@{$points}) {

      # We don't need to caculate if anchor and end are the same
      # or we have already seen these two pairs [reversed] before    
      unless ( ($p1 == $p2) || $found->{$p2}->{$p1} ) {

        # Compute the edge
        my $edge = sqrt(
          ($p1->[0] - $p2->[0])**2 +
          ($p1->[1] - $p2->[1])**2
        );

        # Push the whole thing on a stack
        push (@edges, [ $edge, $p1->[0], $p1->[1], $p2->[0], $p2->[1] 
+]);

        # Keep track of the edges we've already computed (no need to d
+o so twice)
        $found->{$p2}->{$p1} = 1;

      }
    }
  }

  return(\@edges);
}

######################################################################
######################################################################

sub find_unintersected_edges
{

  my $PAIRS = shift;

  my $neighborEdges; # returnable data structure
  my $neighboorCNT = 1;
  foreach my $aref ( sort {$a->[0] <=> $b->[0] } @{$PAIRS}) {

    my $neighbor = 1;

    my $line_1_ref = [];
    push (@{$line_1_ref}, [$aref->[1],$aref->[2]]); # line 1 point A X
+,Y
    push (@{$line_1_ref}, [$aref->[3],$aref->[4]]); # line 1 point B X
+,Y

    for (my $n=1;$n<=$neighboorCNT;$n++) {
      if ($neighborEdges->{$n}) {

        my $points_ref;
        push (@{$points_ref}, @{$line_1_ref}); # line 1 points
        push (@{$points_ref}, [$neighborEdges->{$n}->[1],$neighborEdge
+s->{$n}->[2]]); # line 2 point A X,Y
        push (@{$points_ref}, [$neighborEdges->{$n}->[3],$neighborEdge
+s->{$n}->[4]]); # line 2 point B X,Y

        # If a intersect is found, set false and quit checking
        if (SegmentIntersection($points_ref)) { 
          $neighbor = 0;
          last;
        }

      }
    }


    if ($neighbor) { 
      $neighborEdges->{$neighboorCNT} = $aref;
      $neighboorCNT++;
    }
  }

  return($neighborEdges);
}

######################################################################
+##########
#
# Shamlessly stolen from Math::Geometry::Planar

sub SegmentIntersection {
  my $pointsref = $_[0];
  my @points = @$pointsref;
  my @p1 = @{$points[0]}; # p1,p2 = segment 1
  my @p2 = @{$points[1]};
  my @p3 = @{$points[2]}; # p3,p4 = segment 2
  my @p4 = @{$points[3]};

  my $n1 = Determinant(($p3[0]-$p1[0]),($p3[0]-$p4[0]),($p3[1]-$p1[1])
+,($p3[1]-$p4[1]));
  my $n2 = Determinant(($p2[0]-$p1[0]),($p3[0]-$p1[0]),($p2[1]-$p1[1])
+,($p3[1]-$p1[1]));
  my $d  = Determinant(($p2[0]-$p1[0]),($p3[0]-$p4[0]),($p2[1]-$p1[1])
+,($p3[1]-$p4[1]));
  if (abs($d) < $delta) {
    return 0; # parallel
  }
  if (!(($n1/$d < 1) && ($n2/$d < 1) &&
        ($n1/$d > 0) && ($n2/$d > 0))) {
    return 0;
  }
  return(1);
}

######################################################################
+##########
#  
#  The determinant for the matrix  | x1 y1 |
#                                  | x2 y2 |
#
# args : x1,y1,x2,y2
#

sub Determinant {
  my ($x1,$y1,$x2,$y2) = @_;
  return ($x1*$y2 - $x2*$y1);
}
[download]

The correct answer is Unintersected edges 1778.

In this code I'm only trying to find the unique edges. Once I have that the statistics I need to caculate are trivial.
Also I took the code from Math::Geometry::Planar and tweaked the SegmentIntersection routine to return a boolean. As written it returned the intersection point which was more than I actually needed.

Devel::Profile tell me this:

%Time    Sec.     #calls   sec/call  F  name
58.21 1889.8621  28882029   0.000065     main::SegmentIntersection
30.60  993.5924        1  993.592408     main::find_unintersected_edge
+s
11.12  361.0416  86646087   0.000004     main::Determinant
 0.05    1.7613        1   1.761285     main::find_unique_edges
[download]

So I'm spending a ton of time doing the math and looping through the edge lists. I'm looking for suggestions of things to speed up this algorithm, though I'm willing to entertain ideas of differing algorithms. :)

In reply to Millions of line segment intersection calcs: Looking for speed tips by drewhead

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.