Re: Millions of line segment intersection calcs: Looking for speed tips

I suspect there's a way to optimize the algorithm itself, but I really can't think of what it would be. However, there are numerous little changes that might help your performance a bit (though probably nothing too dramatic).

Why are $points and $neighborEdges hashrefs mapping integers (as strings) to values? It seems like it would be much more natural to make them simply arrayrefs (or just arrays). Then you wouldn't need $neighboorCNT. So instead of
```
for (my $n=1;$n<=$neighboorCNT;$n++) {
      if ($neighborEdges->{$n}) {
```
you get
```
for my $n (@$neighborEdges){
```
and wherever you used $neighborEdges->{$n} you'd just use $n. That would save a lot of hash accesses.
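As a rough sketch of the difference (the edge data here is made up, not taken from your code, and `$neighborEdgesHash` is just a name I picked to keep the two layouts apart): both loops visit the same values, but the array version needs no counter, no string keys and no existence check.

```perl
use strict;
use warnings;

# Hypothetical edge data: each edge is a pair of point ids.
my @edges = ([1, 2], [2, 3], [3, 1]);

# Hash keyed by integers, roughly the layout in the original code.
my $neighborEdgesHash = {};
my $neighboorCNT = 0;
$neighborEdgesHash->{ ++$neighboorCNT } = $_ for @edges;

# Arrayref layout: same data, no counter needed.
my $neighborEdges = [@edges];

my (@fromHash, @fromArray);

# Hash version: one or two hash lookups per iteration.
for (my $n = 1; $n <= $neighboorCNT; $n++) {
    if ($neighborEdgesHash->{$n}) {
        push @fromHash, $neighborEdgesHash->{$n};
    }
}

# Array version: the loop variable is the edge itself.
for my $edge (@$neighborEdges) {
    push @fromArray, $edge;
}

print scalar(@fromHash), " vs ", scalar(@fromArray), " edges visited\n";
```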
Very minor point, but it looks like you don't actually care about the real distance between points. Instead, you care about relative distances. So you can store the square of the distances (by not doing the sqrt) and get the same results.
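A small sketch of that idea, using made-up points and a hypothetical helper I'm calling `dist2`: sorting by squared distance gives the same order as sorting by the real distance, because sqrt is monotonic for non-negative values.

```perl
use strict;
use warnings;

# Squared Euclidean distance between two [x, y] points (hypothetical helper).
sub dist2 {
    my ($p, $q) = @_;
    my $dx = $p->[0] - $q->[0];
    my $dy = $p->[1] - $q->[1];
    return $dx * $dx + $dy * $dy;
}

my $origin = [0, 0];
my @pts = ([3, 4], [1, 1], [6, 8]);   # made-up sample points

# Order by squared distance (no sqrt) and by actual distance (with sqrt).
my @bySquared = sort { dist2($origin, $a) <=> dist2($origin, $b) } @pts;
my @byActual  = sort { sqrt(dist2($origin, $a)) <=> sqrt(dist2($origin, $b)) } @pts;

print join(" ", map { "($_->[0],$_->[1])" } @bySquared), "\n";
```

If you need the real distance for the few edges you end up keeping, you can take the sqrt of just those afterwards.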
Since Determinant is called so many times, I wonder whether you'd get any improvement from rewriting it without the temporary variables.
```
sub Determinant {
  return ($_[0]*$_[3] - $_[2]*$_[1]);
}
```
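If you'd rather measure than guess, the core Benchmark module can compare the two. This is only a sketch: `DeterminantTemps` is my guess at the shape of a version with temporaries, not a copy of your code, and both sub names are made up for the comparison.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed shape of the original: unpack @_ into temporaries first.
sub DeterminantTemps {
    my ($x1, $y1, $x2, $y2) = @_;
    return ($x1 * $y2 - $x2 * $y1);
}

# Version without temporaries, indexing @_ directly.
sub DeterminantDirect {
    return ($_[0] * $_[3] - $_[2] * $_[1]);
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    temps  => sub { DeterminantTemps(1.5, 2.5, 3.5, 4.5) },
    direct => sub { DeterminantDirect(1.5, 2.5, 3.5, 4.5) },
});
```

The numbers will vary by machine and perl version, so treat any difference as a hint rather than a guarantee.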

I'd probably try rethinking SegmentIntersection. Here's an unbenchmarked (and untested) version that tries to save on some of the divisions, save calls you don't need, etc. Basically, if $d is zero, we don't even need $n1 or $n2. For the inequalities, instead of dividing by $d again and again, we can multiply both sides of the inequality by $d (and flip the inequality if $d is less than zero).

```
sub SegmentIntersection {
  my @points = @{$_[0]};
  my @p1 = @{$points[0]}; # p1,p2 = segment 1
  my @p2 = @{$points[1]};
  my @p3 = @{$points[2]}; # p3,p4 = segment 2
  my @p4 = @{$points[3]};

  my $d  = Determinant(($p2[0]-$p1[0]),($p3[0]-$p4[0]),
                       ($p2[1]-$p1[1]),($p3[1]-$p4[1]));
  if (abs($d) < $delta) {
    return 0; # parallel
  }

  my $n1 = Determinant(($p3[0]-$p1[0]),($p3[0]-$p4[0]),
                       ($p3[1]-$p1[1]),($p3[1]-$p4[1]));
  my $n2 = Determinant(($p2[0]-$p1[0]),($p3[0]-$p1[0]),
                       ($p2[1]-$p1[1]),($p3[1]-$p1[1]));

  if ($d > 0){
    return $n1 < $d && $n2 < $d && $n1 > 0 && $n2 > 0;
  } else {
    return $n1 > $d && $n2 > $d && $n1 < 0 && $n2 < 0;
  }
}
```
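Since that version is untested, here is a quick sanity check you could run. It is a sketch under a few assumptions: the value of `$delta` is a guess (yours isn't shown), the helper names `Dets`, `WithDivision` and `WithoutDivision` are made up, and the four test segments are arbitrary. It compares the multiply-through test against a plain divide-by-$d version.

```perl
use strict;
use warnings;

my $delta = 1e-9;   # assumed tolerance; the original value isn't shown

sub Determinant { return ($_[0]*$_[3] - $_[2]*$_[1]); }

# Returns ($d, $n1, $n2) for segments p1-p2 and p3-p4.
sub Dets {
    my ($p1, $p2, $p3, $p4) = @{$_[0]};
    my $d  = Determinant($p2->[0]-$p1->[0], $p3->[0]-$p4->[0],
                         $p2->[1]-$p1->[1], $p3->[1]-$p4->[1]);
    my $n1 = Determinant($p3->[0]-$p1->[0], $p3->[0]-$p4->[0],
                         $p3->[1]-$p1->[1], $p3->[1]-$p4->[1]);
    my $n2 = Determinant($p2->[0]-$p1->[0], $p3->[0]-$p1->[0],
                         $p2->[1]-$p1->[1], $p3->[1]-$p1->[1]);
    return ($d, $n1, $n2);
}

# Reference: divide by $d and test both parameters against (0, 1).
sub WithDivision {
    my ($d, $n1, $n2) = Dets($_[0]);
    return 0 if abs($d) < $delta;
    my ($t, $s) = ($n1 / $d, $n2 / $d);
    return ($t > 0 && $t < 1 && $s > 0 && $s < 1) ? 1 : 0;
}

# Division-free: compare against $d, flipping the inequalities when $d < 0.
sub WithoutDivision {
    my ($d, $n1, $n2) = Dets($_[0]);
    return 0 if abs($d) < $delta;
    my $hit = $d > 0
        ? ($n1 < $d && $n2 < $d && $n1 > 0 && $n2 > 0)
        : ($n1 > $d && $n2 > $d && $n1 < 0 && $n2 < 0);
    return $hit ? 1 : 0;
}

my @cases = (
    [[0,0],[2,2],[0,2],[2,0]],   # crossing diagonals ($d > 0)
    [[0,0],[2,2],[2,0],[0,2]],   # same, second segment reversed ($d < 0)
    [[0,0],[1,0],[0,1],[1,1]],   # parallel
    [[0,0],[1,1],[3,0],[3,5]],   # lines cross, segments do not
);
my @results = map { [WithDivision($_), WithoutDivision($_)] } @cases;
print join(" ", map { join("/", @$_) } @results), "\n";   # prints "1/1 1/1 0/0 0/0"
```

Four cases prove nothing in general, of course, but they do exercise both signs of $d and the early return.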

All small things, but they might help a little.

Re^2: Millions of line segment intersection calcs: Looking for speed tips by Anonymous Monk on Aug 03, 2005 at 16:37 UTC
Might also try inlining Determinant and some common sub-expression elimination...
```
sub SegmentIntersection {
  my @points = @{$_[0]};
  my @p1 = @{$points[0]}; # p1,p2 = segment 1
  my @p2 = @{$points[1]};
  my @p3 = @{$points[2]}; # p3,p4 = segment 2
  my @p4 = @{$points[3]};

  # shared differences, each computed once
  my $a = $p2[0] - $p1[0];
  my $b = $p3[1] - $p4[1];
  my $c = $p2[1] - $p1[1];
  my $d = $p3[0] - $p4[0];

  my $det = $a*$b - $c*$d;
  return 0 if (abs($det) < $delta); # parallel

  my $e = $p3[1]-$p1[1];
  my $f = $p3[0]-$p1[0];
  my $n1 = $f*$b - $e*$d;
  my $n2 = $a*$e - $c*$f;

  if ($det > 0) {
    return $n1 < $det && $n2 < $det && $n1 > 0 && $n2 > 0;
  } else {
    return $n1 > $det && $n2 > $det && $n1 < 0 && $n2 < 0;
  }
}
```
Re^2: Millions of line segment intersection calcs: Looking for speed tips by drewhead (Beadle) on Aug 03, 2005 at 20:11 UTC
Thanks for all your suggestions. I've considered all of them and here's what I found.

"Why are $points and $neighborEdges hashrefs mapping integers (as strings) to values?"
Big picture implementation: what we are talking about is one method in a larger OO-style pm I'm building. $points gets populated elsewhere and it's easier to pass this around as a ref. Obviously this isn't apparent in my example. However, I really have no reason to keep neighborEdges that way and can change it. Impact here was marginal, if any.

"but it looks like you don't actually care about the real distance between points. Instead, you care about relative distances. So you can store the square of the distances (by not doing the sqrt) and get the same results."
I do care about the actual distances, but only for those edges I identify as neighbors. So this point is quite valid: I don't really need to sqrt until after I identify these. This isn't going to impact things, since the calculation of the edges takes a second vs. 17ish minutes for the intersect loop.

"Since Determinant is called so many times, I wonder whether you'd get any improvement over rewriting it without the temporary variables."
Tested; the answer appears to be no.

"I'd probably try rethinking SegmentIntersection."
Ahhh! I love coming here and having people see all these things that make perfect sense yet never occurred to me. Of course! This only has marginal impact, though. The test data leaves only 2240 Determinant calls out... so 1120 get dropped early.