in reply to Calculating the average on a timeslice of data
First, I considered how to handle the fact that you are interested in only 120 dates. One way to do this would be to provide the desired dates as an input parameter, preferably in a file with one date per line. The data file to be processed then becomes a second parameter.
Next, I considered how to calculate the average itself. Assuming the values won't result in integer overflow, the simplest method would be to track, for each desired date, the running sum and the number of entries seen. The average is just the quotient of the two. Here is the finished program (not tested).
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

my %opt;
get_args(\%opt);

# Hash ref keyed by the dates we care about
my $desired = load_date_file($opt{d});

open(my $fh, '<', $opt{i}) or die "Unable to open '$opt{i}' for reading: $!";
while (<$fh>) {
    chomp;
    my ($id, $date, $val) = split ' ', $_, 3;
    # Skip short lines and dates that were not requested
    next if ! defined $val || ! exists $desired->{$date};
    $desired->{$date}{sum} += $val;
    $desired->{$date}{cnt}++;
}

# One CSV line per desired date: date,sum,count,average
for my $date (sort keys %$desired) {
    my $sum = $desired->{$date}{sum} || 0;
    my $cnt = $desired->{$date}{cnt} || 0;
    my $avg = $cnt ? sprintf('%.2f', $sum / $cnt) : 0;
    print join(',', $date, $sum, $cnt, $avg), "\n";
}

sub load_date_file {
    my ($file) = @_;
    my %desired;
    open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";
    while (<$fh>) {
        chomp;
        if (! /^\d{4}$/) {
            warn "'$_' is not in MMYY format - skipping\n";
            next;
        }
        $desired{$_} = undef;
    }
    return \%desired;
}

sub get_args {
    my ($opt) = @_;
    my $Usage = qq{Usage: $0 -d <date_file> -i <input_file>
    -h : This help message
    -d : The (d)ate file
    -i : The (i)nput file
} . "\n";
    getopts('hd:i:', $opt) or die $Usage;
    die $Usage if $opt->{h} || ! defined $opt->{d} || ! defined $opt->{i};
}
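The core of the approach (a running sum and count per desired date, divided at the end) can also be tried without any files. The following is only a minimal sketch: the sub name average_by_date, the sample records, and the "id date value" layout are assumptions made for illustration, not something taken from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical records in "id date value" form (the layout is an assumption).
my @records = (
    'a1 0111 10',
    'a2 0111 20',
    'a3 0211 5',
    'a4 0311 7',    # date is not wanted, so this line is ignored
);

# Dates of interest (MMYY), standing in for the contents of the date file.
my %wanted = ('0111' => 1, '0211' => 1, '0411' => 1);

# Returns a hash ref mapping each wanted date to its average (0 if never seen).
sub average_by_date {
    my ($wanted, $records) = @_;

    my %stat;
    for my $date (keys %$wanted) {
        $stat{$date} = { sum => 0, cnt => 0 };
    }

    for my $line (@$records) {
        my ($id, $date, $val) = split ' ', $line, 3;
        next if ! defined $val || ! exists $stat{$date};
        $stat{$date}{sum} += $val;
        $stat{$date}{cnt}++;
    }

    my %avg;
    for my $date (keys %stat) {
        # Guard against division by zero for dates with no entries
        my $cnt = $stat{$date}{cnt};
        $avg{$date} = $cnt ? $stat{$date}{sum} / $cnt : 0;
    }
    return \%avg;
}

my $avg = average_by_date(\%wanted, \@records);
for my $date (sort keys %$avg) {
    printf "%s,%.2f\n", $date, $avg->{$date};
}
```

With the sample data above this should print 0111,15.00 then 0211,5.00 then 0411,0.00. The last line shows why the count is checked before dividing.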
Cheers - L~R
Re^2: Calculating the average on a timeslice of data
by perlbrother (Initiate) on Jul 07, 2011 at 15:39 UTC