in reply to Grouping numbers
The following code seems to do what you are after, although a module solution may be more appropriate.
    use warnings;
    use strict;

    use constant kMaxDelta => 0.15;

    my @rawData = (<DATA>);

    chomp @rawData;
    @rawData = sort {$a <=> $b} @rawData;

    my @groups;
    my @currGroup = (shift @rawData);

    for (@rawData, undef) {
        if (defined $_ and abs ($_ - sum (@currGroup, $_) / (@currGroup + 1)) <= kMaxDelta) {
            # Ok to add to current group
            push @currGroup, $_;
        } else {
            # time for a new group
            push @groups, [@currGroup];
            @currGroup = ($_);
        }
    }

    my $count = 1;

    for my $group (@groups) {
        if (@$group == 1) {
            print "Outlier: $group->[0]\n\n";
        } else {
            my $avg = sum (@$group) / @$group;
            my @outliers;
            my @ok;

            abs ($avg - $_) > kMaxDelta ? push @outliers, $_ : push @ok, $_ for @$group;
            printf "group$count: Average: %.2f\n", $avg;
            printf "$_ diff=%.2f\n", abs ($avg - $_) for (@ok);
            print "\n";
            ++$count;
            printf "Outlier: $_ since dif=%.2f>X\n\n", abs ($avg - $_) for @outliers;
        }
    }

    sub sum {
        my $sum = shift;
        $sum += $_ for @_;
        return $sum;
    }

    __DATA__
    100.20
    100.23
    100.35
    122.43
    122.55
    122.67
    122.75
    145.88
    145.97
    146.01
    146.10
Prints:
    group1: Average: 100.26
    100.20 diff=0.06
    100.23 diff=0.03
    100.35 diff=0.09

    group2: Average: 122.60
    122.55 diff=0.05
    122.67 diff=0.07
    122.75 diff=0.15

    Outlier: 122.43 since dif=0.17>X

    group3: Average: 145.99
    145.88 diff=0.11
    145.97 diff=0.02
    146.01 diff=0.02
    146.10 diff=0.11
Note that the results you gave are inconsistent with the data you supplied. I changed 122.45 to 122.43 to reproduce the same outlier, but that also changes the deltas for the second group.
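On the module point: here is a minimal sketch of the same running-average grouping as a reusable sub, assuming the core List::Util module is acceptable in place of the hand-rolled sum. The name group_by_running_mean and its argument order are made up for illustration, and it only does the grouping pass, not the per-group outlier report.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper (not from any module): sort the values, then keep
# adding a value to the current group while it stays within $maxDelta of
# the mean the group would have with that value included.
sub group_by_running_mean {
    my ($maxDelta, @values) = @_;
    my @sorted = sort { $a <=> $b } @values;
    return unless @sorted;

    my @groups;
    my @curr = (shift @sorted);
    for my $value (@sorted) {
        my $mean = sum(@curr, $value) / (@curr + 1);
        if (abs($value - $mean) <= $maxDelta) {
            push @curr, $value;          # still close enough to the group
        }
        else {
            push @groups, [@curr];       # close the group and start a new one
            @curr = ($value);
        }
    }
    push @groups, [@curr];               # flush the final group
    return @groups;
}

# Example with a shortened data set
my @groups = group_by_running_mean(0.15,
    qw(100.20 100.23 100.35 122.43 122.55 145.88));
print scalar(@groups), " groups\n";
print join(' ', @$_), "\n" for @groups;
```

With that shortened list I get three groups, the last holding only 145.88. Pushing the final group after the loop replaces the trailing undef sentinel used in the code above.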