You have a clear and fairly simple (though not trivial) criterion that applies to a portion of a very large and complex data structure. One approach would be to design a relational table schema to store the full data structure; this could have one table with a record for each "road chunk", another table with a record for each X,Y point referenced by a road chunk, and perhaps a third table that relates the chunk records to the point records. By structuring the data this way, it will be easier to identify and manipulate just the information that you need to solve the joining problem (in fact, you probably want yet another table to hold the "chunk-chain" findings).

In fact, just breaking the problem down into separate data structures (tables) this way might clarify what the algorithm needs to do, whether or not you actually end up using an RDBMS.

It may be sufficient just to have the set of X,Y points as a unified data structure, with info on which road-chunk(s) each point belongs to; joining the road chunks is now just a matter of determining, for each point, how many road chunks contain it, and whether two (or more?) chunks have a given point as a terminus.

In other words, you started with an array of chunks, with each chunk containing an array of points -- try making a different structure the other way around: an array of points, and each one cites one or more road-chunks that it is a part of. This would be easiest if the point array is actually a hash, keyed by the X,Y coordinates.

    my %pointdata;
    my @chunkends;
    for ( my $i = 0; $i < @data; $i++ )    # road-chunks
    {
        for ( my $j = 0; $j < @{ $data[$i] }; $j++ )    # points in a chunk
        {
            my $key = join( ",", $data[$i][$j]->{X}, $data[$i][$j]->{Y} );
            push @{ $pointdata{$key} }, sprintf( "%5.5d:%5.5d", $i, $j );
            $chunkends[$i] = $key if ( $j == 0 );
            # append the last point's key, giving "firstkey-lastkey"
            $chunkends[$i] .= "-$key" if ( $j == $#{ $data[$i] } );
        }
    }
    # Now the %pointdata hash contains all the information
    # about junctures between chunks (these are not necessarily
    # all end-point junctures -- perhaps some road chunks could
    # intersect at the mid-points?) Also, the @chunkends array
    # cites the %pointdata keys for the endpoints of each chunk;
    # if you only want to find end-point junctures, identify the
    # %pointdata keys with multi-element entries and grep for
    # each one among the strings in @chunkends.

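
For the last step mentioned in those comments (finding end-point junctures only), here is a minimal sketch. It is not the grep over @chunkends described above; instead it indexes just the first and last point of each chunk directly, which amounts to the same test. The sample @data and the sub names endpoint_index and endpoint_junctures are made up for illustration, and the sketch assumes each point is a hash with X and Y keys, as in the loop above.

```perl
use strict;
use warnings;

# Hypothetical sample data: three road chunks, each an array of {X,Y} points.
# Chunk 0 ends where chunk 1 starts (2,0); chunk 2 shares no endpoint.
my @data = (
    [ { X => 0, Y => 0 }, { X => 1, Y => 0 }, { X => 2, Y => 0 } ],
    [ { X => 2, Y => 0 }, { X => 3, Y => 1 } ],
    [ { X => 5, Y => 5 }, { X => 6, Y => 5 } ],
);

# Map each endpoint key "X,Y" to the indices of the chunks that
# start or end at that point.
sub endpoint_index {
    my (@chunks) = @_;
    my %ends;
    for my $i ( 0 .. $#chunks ) {
        my @pts = @{ $chunks[$i] };
        next unless @pts;    # skip empty chunks
        my %seen;            # list a chunk once even if it is a closed loop
        for my $pt ( $pts[0], $pts[-1] ) {
            my $key = join ",", $pt->{X}, $pt->{Y};
            push @{ $ends{$key} }, $i unless $seen{$key}++;
        }
    }
    return %ends;
}

# Keep only the endpoints that are shared by two or more chunks.
sub endpoint_junctures {
    my %ends = endpoint_index(@_);
    my %shared;
    for my $key ( keys %ends ) {
        $shared{$key} = $ends{$key} if @{ $ends{$key} } > 1;
    }
    return %shared;
}

my %junctures = endpoint_junctures(@data);
for my $key ( sort keys %junctures ) {
    print "$key: chunks @{ $junctures{$key} }\n";
}
```

With the sample data this reports the single shared endpoint 2,0 for chunks 0 and 1. Whether three or more chunks meeting at one point should be joined is a separate decision that this sketch leaves open.
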
None of that is tested, but I hope it might be helpful.

In reply to Re: AoA data merging by graff
in thread AoA data merging by jasonk
