Your data structures confuse me... sorry, it's late in the evening.
Storing everything in memory seems wasteful if you only need the mean, variance and the like. Why don't you have a look at Statistics::Descriptive? Its Sparse variant lets you store things sparsely (i.e., only their main statistical properties, rather than all the data points).
Update: Of course, it does that by doing some arithmetic on each add, but that cost should be negligible compared to the memory savings. It might slow you down if you had 100 million rows, but think of what those would do to your memory.
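To give an idea of what "arithmetic on each add" can look like, here is a minimal sketch of a running mean and variance using Welford's online algorithm. This is my own illustration, not code from Statistics::Descriptive, and I am not claiming the module works this way internally; the names new_running_stats, add_value, running_mean and running_variance are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Running statistics via Welford's online algorithm: only three
# scalars are kept per variable, no matter how many values arrive.
# (Hypothetical helpers, not taken from Statistics::Descriptive.)
sub new_running_stats {
    return { count => 0, mean => 0, m2 => 0 };
}

sub add_value {
    my ($stats, $x) = @_;
    $stats->{count}++;
    my $delta = $x - $stats->{mean};
    $stats->{mean} += $delta / $stats->{count};
    # m2 accumulates the sum of squared deviations from the current mean
    $stats->{m2} += $delta * ($x - $stats->{mean});
    return $stats;
}

sub running_mean { return $_[0]{mean} }

# Sample variance (divides by n - 1); undef for fewer than two values
sub running_variance {
    my ($stats) = @_;
    return undef if $stats->{count} < 2;
    return $stats->{m2} / ($stats->{count} - 1);
}

my $stats = new_running_stats();
add_value($stats, $_) for (2, 4, 4, 4, 5, 5, 7, 9);
printf "n = %d, mean = %.4f, var = %.4f\n",
    $stats->{count}, running_mean($stats), running_variance($stats);
```

Each add costs a handful of floating-point operations, and the memory per variable stays constant however many rows you feed in.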

Thus, when you parse your "Year" line, you know which variables you want the stats for: create one Statistics::Descriptive::Sparse object for each, and from there on just add the data points from each split line. Also, I'd rather use an array, something like:

#!/usr/bin/perl -lw
use strict;
use Statistics::Descriptive;

my $filename = '224_APID003_report.csv';
open (READ_IN, "<$filename") or die "I can't open $filename to read.\n";

my @idxs;
my %vars;
my @names;
while (<READ_IN>) {
    chomp;
    /^Year/ && do {
        @names = split /,/;
        @idxs = grep { $names[$_] !~ /TIME|YEAR/i } (0..@names-1);
        $vars{$_} = Statistics::Descriptive::Sparse->new() for @names[@idxs];
    };
    /^\d{4}/ && do {
        my @values = split /,/;
        $vars{$names[$_]}->add_data($values[$_]) for @idxs;
    };
}
close READ_IN or die $!;

foreach (keys %vars) {
    printf "%20s: mean = %10.4f, var = %10.4f\n", $_, $vars{$_}->mean(), $vars{$_}->variance();
}

btw, this is not tested, I might be writing rubbish...

Update: Tested, corrected, prettified, and filled in all the details. I hope it works for you. The other Sparse methods of Statistics::Descriptive seem to cover everything you need.

Update 2: I am aware that the heavy use of $_ throughout this piece of code makes it harder to read than it would be with all the loops expanded. On the other hand, I think I'm way too attached to grep, map and postfix for, with their terseness... I might eventually write a meditation about Perl and the Bulgarian language.


In reply to Re: Efficient use of memory by ivancho
in thread Efficient use of memory by K_M_McMahon
