PerlMonks  

Re: Sorting by geographical proximity / clumping groups of items based on X and Y

by Abigail-II (Bishop)
on Jul 18, 2002 at 13:10 UTC


in reply to Re: Re: Sorting by geographical proximity / clumping groups of items based on X and Y
in thread Sorting by geographical proximity / clumping groups of items based on X and Y

Despite what I just said about NP-completeness, the following algorithm might give a reasonable solution (certainly not optimal).
  1. Let C be the set of all complaints.
  2. For each complaint c in C find the set S(c) of all complaints d in C such that the distance between c and d is less than X (X is user defined).
  3. Find complaint e in C such that |S(e)| <= |S(f)| for all f in C; that is, find the complaint that has the most other complaints nearby. Pick a random one in case of a tie.
  4. Make a clump out of e and S(e).
  5. Remove all complaints g in S(e) from C. For all h remaining in C, remove from S(h) all g in S(e).
  6. If C is empty, we're done. Else, goto 3.
Some pseudo code:
    # Get set of all complaints. (get_all_complaints, distance,
    # make_clump and $X are to be supplied by the caller.)
    my @C = get_all_complaints;

    # Find all the associated sets: complaint => {nearby complaint => 1}.
    my %D = map {my $c = $_;
                 $c => {map {$_ => 1}
                        grep {$c ne $_ && distance ($c, $_) < $X} @C}} @C;

    while (%D) {
        # Find complaint with the most nearby. Start at -1 so that a
        # complaint without any neighbours can still be picked.
        my ($complaint, $size) = (undef, -1);
        while (my ($c, $set) = each %D) {
            ($complaint = $c, $size = keys %$set) if keys %$set > $size;
        }

        # Found largest, make a clump. The associated set is a hash,
        # so pass its keys.
        make_clump ($complaint, keys %{$D {$complaint}});

        # Delete largest from set.
        my $set = delete $D {$complaint};

        # Delete associated set from set.
        delete @D {keys %$set};

        # Delete associated set from associated sets.
        delete @{$_} {keys %$set} for values %D;
    }
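The performance will be quadratic in the number of complaints, I'm afraid: finding the associated sets alone computes the distance between every pair.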
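To try the steps above on real data, here is a minimal self-contained sketch. Everything not in the post is an assumption made for illustration: the %pos table of made-up coordinates, the Euclidean distance, and the clump() wrapper that returns the clumps instead of calling make_clump. It picks the complaint with the most neighbours, as the gloss in step 3 says, and breaks ties by id rather than at random so that the output is repeatable.

```perl
use strict;
use warnings;

# Hypothetical input: complaint id => [x, y]. Made up for illustration.
my %pos = (
    p1 => [0, 0],   p2 => [1, 0],   p3 => [0, 1],
    p4 => [10, 10], p5 => [11, 10], p6 => [50, 50],
);

# Assumed metric: plain Euclidean distance between two complaint ids.
sub distance {
    my ($p, $q) = @_;
    return sqrt(($pos{$p}[0] - $pos{$q}[0]) ** 2
              + ($pos{$p}[1] - $pos{$q}[1]) ** 2);
}

# Greedy clumping following steps 1-6; $X is the distance threshold.
# Returns a list of array refs, one per clump.
sub clump {
    my ($X, @C) = @_;

    # Step 2: the set of nearby complaints for every complaint.
    my %D = map {
        my $c = $_;
        $c => { map { $_ => 1 } grep { $c ne $_ && distance($c, $_) < $X } @C };
    } @C;

    my @clumps;
    while (%D) {
        # Step 3: complaint with the most neighbours; ties go to the
        # alphabetically first id (a deterministic stand-in for "random").
        my ($best) = sort {
            keys(%{ $D{$b} }) <=> keys(%{ $D{$a} }) || $a cmp $b
        } keys %D;

        # Steps 4 and 5: record the clump, then remove its members from
        # the main set and from every remaining associated set.
        my $set = delete $D{$best};
        push @clumps, [ $best, sort keys %$set ];
        delete @D{ keys %$set };
        delete @{$_}{ keys %$set } for values %D;
    }
    return @clumps;
}

# One clump per line.
print join(' ', @$_), "\n" for clump(2, sort keys %pos);
```

With a threshold of 2 this groups p1, p2 and p3 together, p4 with p5, and leaves p6 in a clump of its own. Sorting on every pass costs a bit more than the single scan in the pseudo code, but it keeps the tie-breaking visible.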

Abigail


Replies are listed 'Best First'.
Re: Re: Sorting by geographical proximity / clumping groups of items based on X and Y
by t'mo (Pilgrim) on Jul 18, 2002 at 16:09 UTC
    Shouldn't step 3 in your algorithm description:
    |S(e)| <= |S(f)|
    read
    |S(e)| > |S(f)|
    ? (Not having been a Math major, of course, could lead me to misinterpret "|S(t)|", which here I'm interpreting as "the size of set t".)
    Anyway, I had intended to suggest an algorithm dealing with sorting based on the lengths of line segments, but I now realize that's probably not much use in this case (since vroom wants to find the clumps, not necessarily a point). And reading further down the page, I like Ntromda's idea, but I, like him/her, wouldn't know how to begin coding it (without thinking about it for a long while...).
