in reply to Efficient use of memory
Thus, when you parse your "Year" line, you know which variables you want the stats for: create one Statistics::Descriptive::Sparse object for each, and from there on just add the data points from each split. Also, I'd rather use an array, something like:
#!/usr/bin/perl -lw
use strict;
use Statistics::Descriptive;

my $filename='224_APID003_report.csv';
open (READ_IN,"<$filename") or die "I can't open $filename to read.\n";

my @idxs;
my %vars;
my @names;

while (<READ_IN>) {
    chomp;
    /^Year/ && do {
        @names = split /,/;
        @idxs = grep {$names[$_] !~ /TIME|YEAR/i } (0..@names-1);
        $vars{$_} = Statistics::Descriptive::Sparse->new() for @names[@idxs];
    };
    /^\d{4}/ && do {
        my @values = split /,/;
        $vars{$names[$_]}->add_data($values[$_]) for @idxs;
    };
}
close READ_IN or die $!;

foreach (keys %vars) {
    printf "%20s: mean = %10.4f, var = %10.4f\n",$_, $vars{$_}->mean(), $vars{$_}->variance();
}
btw, this is not tested, I might be writing rubbish...
Update: Tested, corrected, prettified, added all the details. I hope it works for you. The other Sparse methods of Statistics::Descriptive seem to cover everything you need.

Update 2: I am aware that the excessive use of $_ throughout this piece of code makes it more difficult to read than it would be with all the loops expanded. On the other hand, I think I'm way too attached to grep, map and inverse for, with their terseness... I might eventually write a meditation about Perl and the Bulgarian language.
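For what it's worth, the reason the Sparse variant helps with memory is that it does not keep the data points around, only running totals. The sketch below illustrates that general idea in plain Perl using Welford's update. It is not the module's actual implementation, and the names new_running_stats, add_point and sample_variance are made up for this illustration; the variance here uses the n-1 denominator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers, NOT part of Statistics::Descriptive.
# Each accumulator holds three numbers, no matter how many points it sees.
sub new_running_stats {
    return { n => 0, mean => 0, m2 => 0 };
}

# Welford's update: fold one value into the running mean and the
# running sum of squared deviations (m2).
sub add_point {
    my ($s, $x) = @_;
    $s->{n}++;
    my $delta = $x - $s->{mean};
    $s->{mean} += $delta / $s->{n};
    $s->{m2}   += $delta * ($x - $s->{mean});
    return $s;
}

# Sample variance (n-1 denominator); 0 when fewer than two points.
sub sample_variance {
    my ($s) = @_;
    return $s->{n} > 1 ? $s->{m2} / ($s->{n} - 1) : 0;
}

my $stats = new_running_stats();
add_point($stats, $_) for (2, 4, 4, 4, 5, 5, 7, 9);
printf "n = %d, mean = %.4f, var = %.4f\n",
    $stats->{n}, $stats->{mean}, sample_variance($stats);
```

With one such accumulator per column, memory use depends on the number of columns and not on the number of rows, which is the same shape as the %vars hash in the script above.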