Category: utility
Author/Contact Info: kbaucom@schizoid.com
Description: given a column of numbers, output the number of entries, total, min, max, average, median and mode.

use strict;

my $total = 0;
my $count = 0;
my %occur = ();
my @list;

if(@ARGV && $ARGV[0] eq "-f") { # fast mode, averages only
  shift(@ARGV);
  while(<>) {
    next unless(/^\s*(?:\d+\.?\d*|\.\d+)\s*$/); # skip lines that are not a single plain number
    $total+=$_;
    $count++;
  }
  if($count) { printf "    Avg: %.3f\n", $total/$count; } 
  exit;
}

while(<>) {
  next unless(/^\s*(\d+\.?\d*|\.\d+)\s*$/);
  my $num = $1 + 0; # add 0 so " 5", "5" and "5.0" are tallied as one value
  $occur{$num}++;
  push(@list, $num);
  $total+=$num;
}

unless(@list) { exit; } # nothing numeric was read
$count = scalar(@list);

my $mode = (sort {$occur{$b} <=> $occur{$a}} (keys %occur))[0];
my $mode_count = $occur{$mode};

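my @sorted = sort {$a <=> $b} @list;
my ($min,$max) = @sorted[0,-1];
# median: the middle value, or the mean of the two middle values for an even count
my $mid = int($count/2);
my $median = ($count % 2) ? $sorted[$mid] : ($sorted[$mid-1] + $sorted[$mid])/2;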

printf "Entries: %d\n", $count;
printf "    Max: %.3f\n", $max;
printf "    Min: %.3f\n", $min;
printf "  Total: %.3f\n", $total;
printf "    Avg: %.3f\n", $total/$count;
printf " Median: %.3f\n", $median;
printf "   Mode: %.3f (%d occurrences)\n", $mode, $mode_count;
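
The script reports a single mode even when several values tie for the highest count. If you want to see all of them, something along these lines might work. This is only a sketch: all_modes is a made-up helper name, not part of the script above, and it takes the same kind of value => count hash as %occur.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): return a reference to
# every value that shares the highest count, sorted numerically, plus that count.
sub all_modes {
    my (%occur) = @_;
    return ([], 0) unless %occur;
    my $best = 0;
    for my $n (values %occur) { $best = $n if $n > $best; }
    my @modes = sort { $a <=> $b } grep { $occur{$_} == $best } keys %occur;
    return (\@modes, $best);
}

# Sample data: 1 and 3 both appear twice.
my %occur;
$occur{$_}++ for (3, 1, 3, 7, 1, 9);
my ($modes, $mode_count) = all_modes(%occur);
printf "   Mode: %s (%d occurrences each)\n", join(", ", @$modes), $mode_count;
```

Returning an array reference keeps the single-mode case simple (one element) while still letting the caller join the list for printing.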
Re: avg
by graff (Chancellor) on May 29, 2002 at 02:31 UTC
    THANK YOU! I've been wanting just this kind of utility for some time now, but just never got around to doing it. I'll be using it on a regular basis -- in fact, I'll supplement my copy to work/report on any number of columns in one run (e.g. in addition to the "-f" option, I could use something like "-c 1,3-5,e,-2" to get stats on columns 1, 3, 4, 5, the last column, and the column that's two over from the last one).
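
    One way the "-c" idea above could be parsed, as a rough sketch only. expand_columns is a hypothetical name, the columns are assumed to be 1-based, "e" is taken to mean the last column, and "-2" is assumed to mean the column two positions before the last; none of this exists in the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a spec such as "1,3-5,e,-2" into 1-based
# column numbers for a row that has $ncols columns.
sub expand_columns {
    my ($spec, $ncols) = @_;
    my @cols;
    for my $item (split /,/, $spec) {
        if    ($item =~ /^(\d+)-(\d+)$/) { push @cols, $1 .. $2; }       # range, e.g. 3-5
        elsif ($item =~ /^\d+$/)         { push @cols, $item; }          # single column
        elsif ($item eq 'e')             { push @cols, $ncols; }         # last column
        elsif ($item =~ /^-(\d+)$/)      { push @cols, $ncols - $1; }    # counted back from the last
        else  { die "bad column spec item: $item\n"; }
    }
    # keep only columns that actually exist in this row
    return grep { $_ >= 1 && $_ <= $ncols } @cols;
}

# Pick the requested fields out of one whitespace-separated row.
my @fields = split ' ', "10 20 30 40 50 60 70 80";
my @cols   = expand_columns("1,3-5,e,-2", scalar @fields);
print join(" ", map { $fields[$_ - 1] } @cols), "\n";
```

    The spec is expanded per row because "e" and "-2" depend on how many columns that row has; each selected column would then need its own $total, @list and %occur.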
Re: avg
by robobunny (Friar) on May 29, 2002 at 17:39 UTC
    oops, the 'next if' line under fast mode should have been 'next unless'. it's fixed now, but if you were wondering why it didn't work with the -f flag, that's why. that's what i get for editing it right before i post it.