in reply to Perl hit counter for multiple ranges, optimization or options in other languages?

G'day SaktiMonk,

Welcome to the monastery.

Update: My original solution (now within <spoiler> tags below) didn't take into account uneven sized ranges. I didn't notice that but ++Laurent_R did (see below). Here's a new version that still uses an array (@bins) for the range counts; caches the results of determining the range; and still only uses one iteration per range to output the bin counts.

$ perl -Mstrict -Mwarnings -e ' use List::Util qw{first}; my @input = qw{12 14 15 20 21}; my @range_ends = qw{19 29 39}; my (@bins, %range_index_cache); sub get_range_index { ( first { $_[0] <= $range_ends[$_] } 0 .. $#range_ends ) // @r +ange_ends; } for (@input) { ++$bins[$range_index_cache{$_} //= get_range_index($_)]; } for (0 .. $#range_ends) { printf "%d-%d %d\n" => $_ ? $range_ends[$_ - 1] + 1 : 1, $range_ends[$_], $bins[$ +_] // 0; } ' 1-19 3 20-29 2 30-39 0

Any datapoints beyond the ranges you specify are counted in the last element of @bins.

Here's a solution that only has one iteration per datapoint to populate the bins; and one iteration per range to output the bin counts.

$ perl -Mstrict -Mwarnings -e ' my @input = qw{12 14 15 20 21}; my $ranges = 3; my @bins; for (@input) { ++$bins[$_ / 20]; } for (0 .. $ranges - 1) { printf "%d-%d %d\n" => $_ * 20 || 1, $_ * 20 + 19, $bins[$_] / +/ 0; } ' 1-19 3 20-39 2 40-59 0

[Note: If you're using an old version of Perl that doesn't have the // (Logical Defined-Or) operator, you'll need to use the defined function.]

-- Ken

Replies are listed 'Best First'.
Re^2: Perl hit counter for multiple ranges, optimization or options in other languages?
by Laurent_R (Canon) on Aug 08, 2013 at 13:16 UTC

    I also thought about this possible solution, which would be very fast (and easy), but that works only if the bins have an equal size making it possible to compute the right bin by a mathematical formula. But, from the OP's sample data, the bins don't have all the same size.

      ++ Thanks for spotting that. I've updated my node.

      -- Ken