Here's a quick example that will hopefully get you started.

#! /usr/bin/perl use strict ; use warnings ; my @complaints = ( { 'complaint_id' => 'a123', 'x' => '45.2', 'y' => '39.7' }, { 'complaint_id' => 'b456', 'x' => '79.3', 'y' => '42.0' }, { 'complaint_id' => 'c789', 'x' => '11.9', 'y' => '29.8' }, { 'complaint_id' => 'd863', 'x' => '95.3', 'y' => '17.2' }, { 'complaint_id' => 'e635', 'x' => '65.5', 'y' => '33.3' }, ) ; my ( $curr_x, $curr_y ) = ( 36.2, 47.3 ) ; my @sorted_by_proximity = map { $_->[1]->{'dist'} = $_->[0]; $_->[1] } sort { $a->[0] <=> $b->[0] } map { [ distance( $curr_x, $curr_y, $_->{'x'}, $_->{'y'} ), $_ ] } @complaints ; foreach ( @sorted_by_proximity ) { print $_->{'complaint_id'} . " is " . $_->{'dist'} . " away.\n" ; } sub distance { my ($x1, $y1, $x2, $y2 ) = @_ ; return sqrt( abs( $x2 - $x1 )**2 + abs( $y2 - $y1 )**2 ) ; }

This gives you an array ref of complaints, sorted closest-first. That way, you could take the first n complaints, or all the complaints until the distance is greater than x, etc.

Update: Thanks to dws and hossman both correct. Here's mv revised code which may be a little more useful.

#! /usr/bin/perl use strict ; use warnings ; # Define some rules and constants. my $CLUSTER_RADIUS = 10 ; # Max distance to be included in cluster. my $CENTER_SKIP = 5 ; # Distance between centers on cluster scans +. my $CLUSTER_COUNT = 4 ; # Minimum no. of complaints to make a clust +er. my $START_X = 0 ; my $END_X = 100 ; my $START_Y = 0 ; my $END_Y = 100 ; # Load the complaints. my @complaints = () ; while ( <DATA> ) { next if /^$/ ; my $hashref = {} ; ( $hashref->{'id'}, $hashref->{'x'}, $hashref->{'y'} ) = split ; push @complaints, $hashref ; } for ( my $curr_x = $START_X ; $curr_x <= $END_X ; $curr_x += $CENTER_SKIP ) { for ( my $curr_y = $START_Y ; $curr_y <= $END_Y ; $curr_y += $CENTER_SKIP ) { my @near = map { $_->[1]->{'dist'} = $_->[0] ; $_->[1] } grep { $_->[0] <= $CLUSTER_RADIUS } map { [ distance( $curr_x, $curr_y, $_->{'x'}, $_->{'y'} ), $_ ] } @complaints ; my $complaint_count = @near ; if ( $complaint_count >= $CLUSTER_COUNT ) { printf "%d complaints near (%d,%d).\n", $complaint_count, $curr_x, $curr_y ; } } } exit 0 ; #------------------------------------------------------- sub distance { my ($x1, $y1, $x2, $y2 ) = @_ ; return sqrt( ( $x2 - $x1 )**2 + ( $y2 - $y1 )**2 ) ; } __DATA__ a123 10.1 12.5 b124 12.5 8.6 c125 9.8 10.2 d213 24.6 19.5 e753 35.6 2.2 f854 76.5 46.7 g354 8.7 8.6

_______________
D a m n D i r t y A p e
Home Node | Email

In reply to Re: Sorting by geographical proximity / clumping groups of items based on X and Y by DamnDirtyApe
in thread Sorting by geographical proximity / clumping groups of items based on X and Y by vroom

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.