in reply to Picking the best points
I'd partition the data into as many buckets as the number of points you want to end up with, then select the best value from each bucket according to whatever weighting is appropriate. Consider:
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Expects whitespace-separated "x y dy" lines after __DATA__ (the OP's data).
    my $numBuckets = 20;
    my @points = map {{x => $_->[0], 'y' => $_->[1], dy => $_->[2]}}
        map {chomp; [split]} <DATA>;
    my @buckets;
    my $min = $points[0]{x};
    my $max = $points[0]{x};

    # Find the range of x values
    for my $point (@points) {
        $min = $point->{x} if $min > $point->{x};
        $max = $point->{x} if $max < $point->{x};
    }

    my $scale = ($max - $min) / $numBuckets;

    # The fractional array index is truncated to an integer. Note that the
    # point with the largest x can land in index $numBuckets, which the
    # print loop below does not visit.
    push @{$buckets[($_->{x} - $min) / $scale]}, $_ for @points;

    for my $bucket (@buckets) {
        # Sort contents of bucket by weighting function
        next if !defined $bucket;
        @$bucket = sort {$a->{dy} <=> $b->{dy}} @$bucket;
    }

    # Print the best (smallest dy) point from each bucket
    for my $index (0 .. $numBuckets - 1) {
        printf "%3d: ", $index;
        printf "%.4f, %.4f, %.4f", @{$buckets[$index][0]}{qw(x y dy)}
            if defined $buckets[$index];
        print "\n";
    }
Using the data in the OP, that prints:
      0: 0.0345, 0.9916, 0.0013
      1: 0.0499, 0.9876, 0.0011
      2: 0.1340, 0.9659, 0.0012
      3: 0.1635, 0.9578, 0.0011
      4: 0.2149, 0.9412, 0.0047
      5: 0.2911, 0.9215, 0.0015
      6: 0.2974, 0.9186, 0.0010
      7: 0.3617, 0.8983, 0.0018
      8: 0.4183, 0.8819, 0.0010
      9: 0.4535, 0.8672, 0.0085
     10: 0.5317, 0.8421, 0.0010
     11: 0.5689, 0.8306, 0.0040
     12: 0.5995, 0.8179, 0.0056
     13:
     14:
     15:
     16: 0.8015, 0.7142, 0.0249
     17: 0.8540, 0.6901, 0.0060
     18: 0.9126, 0.6475, 0.0020
     19: 0.9690, 0.5879, 0.0023
This has drawn too few points because the actual distribution in the original data is very lumpy: buckets 13 to 15 are empty. If you need a fixed number of points and you are unlikely to get at least one datum in each bucket, then I'd select the extra points from the buckets that hold the greatest number of points.
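A minimal sketch of that top-up step, assuming each bucket is already sorted best-first by dy as in the script above. The name pick_points and the sample buckets are hypothetical, made up for illustration rather than taken from the OP's data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: take the best point from every non-empty bucket, then
# top up to $wanted points from whichever bucket has the most points left.
# Assumes each bucket is already sorted best-first (smallest dy first).
sub pick_points {
    my ($wanted, @buckets) = @_;

    # Work on shallow copies so the caller's buckets are left untouched;
    # undefined (empty) buckets become empty arrays.
    my @remaining = map { [ @{ $_ // [] } ] } @buckets;
    my @picked;

    # First pass: one point per non-empty bucket.
    for my $bucket (@remaining) {
        last if @picked >= $wanted;
        push @picked, shift @$bucket if @$bucket;
    }

    # Top-up pass: draw the next-best point from the currently fullest bucket.
    while (@picked < $wanted) {
        my ($fullest) = sort { @$b <=> @$a } @remaining;
        last if !$fullest || !@$fullest;    # No points left anywhere
        push @picked, shift @$fullest;
    }

    return sort { $a->{x} <=> $b->{x} } @picked;
}

# Made-up sample: three buckets, the middle one empty.
my @buckets = (
    [ { x => 0.01, dy => 0.001 }, { x => 0.02, dy => 0.002 }, { x => 0.03, dy => 0.003 } ],
    undef,
    [ { x => 0.25, dy => 0.004 } ],
);

my @chosen = pick_points(3, @buckets);
printf "%.2f, %.4f\n", @{$_}{qw(x dy)} for @chosen;
```

Here two points come from the first pass and the third is drawn from the first bucket, because it is the only one with points left. Drawing from the fullest bucket keeps the extra points in the densest regions of the data; if you would rather favour the smallest dy overall, sort the leftover points by weight instead.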
|
Re^2: Picking the best points
by kennethk (Abbot) on Oct 29, 2010 at 16:50 UTC |