Given an ordered array of numbers, how can I fold values that fall below a minimum into the nearest neighbour that satisfies the minimum, both (1) from index 0 up to the first "good" neighbour, and (2) from the last "good" neighbour (index "x") to the end of the array?
For example, given an array of numbers: (0, 3, 8, 9, 5, 2, 0, 1, 0) - we want to transform it so that it keeps the same sum (28) but contains only elements whose values are, say, greater than or equal to 5. So we have to reduce it to (11, 9, 8) - summing (0, 3, 8) = 11 (from index 0 to 2) into its first index, and (5, 2, 0, 1, 0) = 8 (from index 4 to index -1) into its last index.
This is a problem that happens in a statistical analysis of a frequency distribution where a statistical test does not handle values that fall below a certain value, but where the order of the values must be respected (as a vector, not just an array of numbers that can have any order). Conventional implementations of a chi-square test of independence/contingency work like this - no expected values less than 5 (e.g., Statistics::Gtest does not permit observed values equalling zero, which generally means no expected values less than 5; so if the observed values correspond to expected values less than 5, we want to sum both arrays, index-by-index, in the same way). Summing up, we want to "bin" values outside of this minimum into their nearest neighbour, within the nearest left- or right-side indices of the distribution that have values satisfying the minimum.
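To make the index-by-index requirement concrete, here is a rough sketch of how I picture the paired case. It assumes the bin boundaries are decided by the expected values alone and then applied to both arrays. The helper name bin_by_expected and the example numbers are made up for illustration, and nothing here calls Statistics::Gtest itself.

```perl
use strict;
use warnings;
use List::Util qw(first sum0);

# Hypothetical helper: find the first and last expected values that meet
# $min, then fold the leading and trailing runs of BOTH arrays at those
# same indices, so observed and expected stay aligned.
sub bin_by_expected {
    my ($min, $obs, $exp) = @_;
    die "arrays differ in length\n" unless @$obs == @$exp;

    my $first = first { $exp->[$_] >= $min } 0 .. $#$exp;
    my $last  = first { $exp->[$_] >= $min } reverse 0 .. $#$exp;

    my $fold = sub {
        my @v = @{ $_[0] };
        return [] unless @v;
        # Zero or one "good" value: everything collapses into one bin.
        return [ sum0(@v) ] if !defined $first || $first == $last;
        return [
            sum0(@v[0 .. $first]),          # left tail plus first good value
            @v[$first + 1 .. $last - 1],    # interior values, left untouched
            sum0(@v[$last .. $#v]),         # last good value plus right tail
        ];
    };

    return ($fold->($obs), $fold->($exp));
}

# Made-up numbers, purely illustrative.
my @observed = (0, 3, 8, 9, 5, 2, 0);
my @expected = (1, 4, 6, 9, 5, 2, 1);
my ($o, $e) = bin_by_expected(5, \@observed, \@expected);
print "observed: @$o\n";   # observed: 11 9 7
print "expected: @$e\n";   # expected: 11 9 8
```

Note that the observed array may still contain small values after binning; only the expected values are guaranteed to reach the minimum at the two ends.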
The following primitive "solution" sums correctly from the left side (from index 0), but misses what's going on at the right side (from index "x") (please forgive the "untidy" C-style for-ing):
    my @bad_ari = (0, 3, 8, 9, 5, 2, 0, 2, 1); # will fail Statistics::Gtest
    my $min = 5; # minimal "goodness"
    my ($new_i, @good_ari) = (0);
    # wishful thinking begins ...
    for (my $i = 0; $i < scalar @bad_ari - 1; $i++) {
        $good_ari[$new_i] += $bad_ari[$i]; # so far so good
        if ($bad_ari[$i] >= $min) { # trick to sum any bad counts into the prior index
            $new_i++;
        }
    }
This results in @good_ari = (11, 9, 5, 4) - which is still "bad" - because we want @good_ari = (11, 9, 10), i.e., the array: ( sum(0, 3, 8), 9, sum(5, 2, 0, 2, 1) ).
I've tried reversing the array after this type of figuring, but just get tied up in knots, or into witchcraft that works in one case but not another. I've turned to List::AllUtils to see if there's a summary method to handle or help with this type of thing ... Perhaps a lama might show me how its methods can yet be relied upon, or point me to another helpful module. Appreciating your lead to elementary truths, anyway.
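For what it's worth, the result I'm after can be sketched without reversing anything. Locate the first and the last index whose value meets the minimum, then sum the two tails into those positions. This is only a sketch under assumptions: the helper name bin_tails is made up, it uses "first" and "sum0" from core List::Util (the firstidx and lastidx functions that List::AllUtils re-exports from List::MoreUtils should serve for the index search too), and it leaves any low values between the first and last "good" index untouched, since I haven't decided what should happen to those.

```perl
use strict;
use warnings;
use List::Util qw(first sum0);

# Hypothetical helper: fold the leading run of values below $min into the
# first value that meets it, and the trailing run into the last such value.
sub bin_tails {
    my ($min, @v) = @_;
    return () unless @v;

    # Search the indices rather than the values, from each end.
    my $first = first { $v[$_] >= $min } 0 .. $#v;
    my $last  = first { $v[$_] >= $min } reverse 0 .. $#v;

    # Zero or one "good" value: everything collapses into a single bin.
    return (sum0(@v)) if !defined $first || $first == $last;

    return (
        sum0(@v[0 .. $first]),          # left tail plus first good value
        @v[$first + 1 .. $last - 1],    # interior values, left untouched
        sum0(@v[$last .. $#v]),         # last good value plus right tail
    );
}

print join(', ', bin_tails(5, 0, 3, 8, 9, 5, 2, 0, 2, 1)), "\n"; # 11, 9, 10
print join(', ', bin_tails(5, 0, 3, 8, 9, 5, 2, 0, 1, 0)), "\n"; # 11, 9, 8
```

The total is preserved because every element lands in exactly one of the three slices.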
Any help, kind admonition, appreciated.