I'm a bit reluctant to go into this because implementing this would be
a rather complex programming challenge. But I'll explain it as best
I can in hopes that someone will find it helpful, educational, or at
least interesting. (:
When I previously had to find overlaps between two sets of shapes,
I certainly couldn't wait to compare against every single shape in
the database. So I had to come up with an index on the table that
would allow me to quickly eliminate almost all of the items in the
database as not being nearly close enough to have to worry about.
The problem with a database index is that it is one-dimensional.
I had previously had to deal with finding the nearest point (not
overlapping shapes) and for that I really needed a two-dimensional
key. But after some work I figured out that I could do pretty good
by sorting the points into parallel "bands" that would define a linear
(one-dimensional) ordering of all the points and allow me to build an
index.
So, if I made the bands W units thick and horizontal and wanted to find
all point within 2*W units of some point, then I might end up doing a
keyed search for records falling into the following 5 sections of bands:
----------------------------
[ ]
----------------------------
[ ]
----------------------------
[ . ]
----------------------------
[ ]
----------------------------
[ ]
----------------------------
For each record, I built a key composed of int(Y/W) then X and then indexed
on that compound key.
For finding overlapping shapes, things get a bit more complicated.
You can key each shape based on it's center. But you'd have to be able
to assert that each shape is no larger than some maximum size or else you
couldn't restrict your search.
Based on your problem description, I think you might be able to just
figure out your maximum radius and index your points into "shafts" and
have a pretty fast solution. That is, for each point in the smaller
set (or in the larger set if this set is used over and over with few
changes against a bunch of different smaller sets) you build a key from the
following values (in the order given):
int(X/W), int(Y/W), Z
where W is the width of your shafts.
Then, if you need to check a sphere centered at (X1,Y1,Z1) and with radius R1,
you need to look for all spheres in the database where their center is
within R1+M of (X1,Y1,Z1). So loop over all int(X/W) between X1-R1-M and
X1+R1+M:
for my $xBand ( int(($X1-$R1-$M)/$W) .. int(($X1-$R1-$M)/$W) ) {
Inside there, loop over all possible Y band coordinates. For simplicity
you can just use:
for my $yBand ( int(($Y1-$R1-$M)/W) .. int(($Y1-$R1-$M)/$W) ) {
or you can do a little math and narrow down that range just a bit
-- in a shaft that is already pretty far from (X1,Y1,Z1), you don't
have to worry about Z being all the way out at Z-R1-M or Z+R1+M since
that would be farther than R1+M from (X1,Y1,Z1). But we'll keep it
simple in hopes of avoiding some bugs in what will already be fairly
complex code.
Now you are looping over all shafts that contain centers within R1+M
of the current center of interest. So, for each shaft, look for centers
in the range. We'll again not optimize this for the sake of simplicity:
my $start= BuildKey( $xBand, $yBand, $Z1-$R1-$M );
my $end= BuildKey( $xBand, $yBand, $Z1+$R1+$M );
... "select * from shapes where key between $start and $end"
my $record;
while( $record= Fetch( ... ) ) {
if( Overlap( $record, ... ) ) {
Report( ... );
return 1;
}
}
}
}
return 0;
But, what if you can't pick a reasonable maximum radius? Well, you just
segregate shapes based on what range their radius is in and add a loop
around all of the above code that iterates over each radius range.
For example, you might realize that most of your radii are <= 2, a
fraction of them are between 2 and 10, and a very few are over 10. In
this case you might build a key as follows:
my @bandWidth= ( 3, 15, 0 );
sub BuildKey {
my( $radius, $x, $y, $z )= @_;
my $sizeRange=
$radius <= 2 ? 0
: $radius <= 10 ? 1
: 2;
my $width= $bandWidth[$sizeRange];
if( 0 == $width ) {
return ConcatForKey( $sizeRange, 0, 0, 0 );
}
return ConcatForKey(
$sizeRange, int($x/$width), int($y/$width), $z );
}
Exactly what ConcatForKey() does as well as how to select and Fetch()
depends on what you want to use for a database. You could use a standard
SQL database and have records that look something like:
real x
real y
real z
real radius
int size
int xBand
int yBand
index size,xBand,yBand,z
where you'll have to pick exactly which datatypes to use such that all
of your points fit.
Or you chould use a DBM file in which case you (pretty much) need to express
each key as a string (of bytes) such that normal "ASCIIbetical" ordering
(as done by cmp) sorts properly.
Assuming that your X and Y coordinates never go beyond -2147483647*$Width
nor 2147483647*$Width, then you can put big-endian longs
into your key string and get a proper sort order. However, you have to
shift the values so that negative numbers sort before positive numbers.
We'll even put the Z values into "bands" so we can use the same strategy
for putting them in the key that we do for the X and Y band numbers:
sub BuildKey {
my( $radius, $x, $y, $z )= @_;
my $sizeRange=
$radius <= 2 ? 0
: $radius <= 10 ? 1
: 2;
my $width= $bandWidth[$sizeRange];
if( 0 == $width ) {
return pack "c", $sizeRange;
}
for my $c ( $x, $y, $z ) {
$c= int($c/$width);
$c ^= 0x8000_0000;
}
return return pack "cNNN", $sizeRange, $x, $y, $z;
}
Note that this assumes 4-byte integers. If you have a version of Perl built with 8-byte integer support, then you'll probably have to change the ^= line. If you want to use 8-byte integers in your keys, then you'll also have to adjust the pack expression.
Once you have this ASCIIbetical key, you can also decide to just hold the
records in memory and, instead of letting a database use b-trees (or similar)
to find interesting records, just use a binary search yourself. And if you
go that route, it then becomes pretty easy to move much of the searching
code into Inline::C to make it quite a bit faster (since it does
quite a bit of fairly simple operations including arithmatic).
- tye |