rangersfan has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

My problem is that I have a text file that looks like

today
average1: 3.2
average2: 5.5
tomorrow
average1: 3.0
average2: 5.0
so on and so on . . . .
and what I want to do is calculate all the average1's and all the average2's.

So I've written something that so far opens a file and does a regex search for

print $line if $line =~ /average1/;
However, I don't know the best way to manipulate this data. More importantly, how would I get it to print just the numbers, so that at the very least I could put them in an array? Or is this the best method?

Re: seeking advise on average value problem
by graff (Chancellor) on Mar 26, 2006 at 00:06 UTC
    I think what you want is a hash-of-hashes data structure. The "outer" (main) hash would be keyed by the strings at the beginning of each line ("average1", "average2", etc). For each of those main hash elements, you have a sub-hash that stores the sum of values associated with the key string, and the number of times the key string occurred in the file.

    Something like this would load that sort of hash, and then print summary values for each primary hash key:

    my %stats;
    while (<>) {
        next unless ( /^(\w+):\s+([\d.]+)/ );
        $stats{$1}{count}++;
        $stats{$1}{sum} += $2;
    }
    printf( "%20s %5s %8s\n", "Name", "Count", "Average" );
    for ( sort keys %stats ) {
        printf( "%19s: %5d %8.2f\n",
                $_, $stats{$_}{count}, $stats{$_}{sum} / $stats{$_}{count} );
    }
      This was more what I had in mind. Basically, I have a log from another script that I would like to calculate differently. It produces a couple of different averages each hour of each day, but they are all mixed together, and I wanted a way to sort out the mess of 5 weeks' worth of data rather than go back to Excel.
        What if I wanted to go a step further, and say that I have

        joe_average: 2.2
        jack_average: 3.2
        stan_average: 9.1

        next day
        joe_average: 5.2
        jack_average: 4.2
        stan_average: 6.1

        next day
        joe_average: 3.2
        jack_average: 4.2
        stan_average: 5.1
        How can I get something like this to work?

        my %stats;
        while (<FILE>) {
            next unless /^(jack_average)\w+:\s+(\d+\.\d+)$/;
            $stats{$1}{count}++;
            $stats{$1}{sum} += $2;
        }
        printf( "%20s %5s %8s\n", "Name", "Count", "Average" );
        for ( sort keys %stats ) {
            printf( "%19s: %5d %8.2f\n",
                    $_, $stats{$_}{count}, $stats{$_}{sum} / $stats{$_}{count} );
        }
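
A note on why the attempt above finds nothing: in /^(jack_average)\w+:.../ the \w+ demands at least one extra word character between "jack_average" and the colon, and lines such as "jack_average: 3.2" have none. Dropping the \w+ should be enough to match a single name. The sketch below goes a little further and assumes every label ends in "_average", tallying all of them in one pass; collect_stats and the inline sample data are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, copied from the post above.
my @lines = (
    'joe_average: 2.2', 'jack_average: 3.2', 'stan_average: 9.1',
    'next day',
    'joe_average: 5.2', 'jack_average: 4.2', 'stan_average: 6.1',
    'next day',
    'joe_average: 3.2', 'jack_average: 4.2', 'stan_average: 5.1',
);

# collect_stats is a hypothetical helper: it tallies a count and a sum
# for every "<name>_average: <number>" line and skips everything else.
sub collect_stats {
    my @input = @_;
    my %stats;
    for my $line (@input) {
        # (\w+_average) captures the whole label up to the colon;
        # the number may or may not have a fractional part.
        next unless $line =~ /^(\w+_average):\s+(\d+(?:\.\d+)?)\s*$/;
        $stats{$1}{count}++;
        $stats{$1}{sum} += $2;
    }
    return %stats;
}

my %stats = collect_stats(@lines);
printf "%20s %5s %8s\n", 'Name', 'Count', 'Average';
for my $name ( sort keys %stats ) {
    printf "%19s: %5d %8.2f\n", $name, $stats{$name}{count},
        $stats{$name}{sum} / $stats{$name}{count};
}
```

To restrict the report to one person, the pattern could be narrowed back to /^(jack_average):\s+.../ without touching the rest of the loop.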
Re: seeking advise on average value problem
by johngg (Canon) on Mar 25, 2006 at 23:58 UTC
    I'm not sure what you mean by calculate, whether you want to average them all or something else. However, one way to gather them all up so you can do what you want with them is this.

    # We already have a filehandle open for reading. Read
    # file line by line, putting data into a hash table.
    #
    our %averages = ();
    while(<IN>)
    {
        # Pick out the "average" lines with a regular
        # expression using () round brackets to remember
        # the "average1 or 2" (in $1) and the value ($2).
        #
        next unless /^(average\d+):\s+(\d+\.\d+)$/;
        push @{$averages{$1}}, $2;
    }

    You end up with a hash table with keys "average1" and "average2", each value being a list of the relevant numbers, so that doing

    print $averages{average1}->[0], "\n";
    print $averages{average2}->[1], "\n";

    would produce

    3.2
    5.0

    I hope this helps.

    Cheers,

    JohnGG
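
If "calculate" means taking the mean of each group, the hash of arrays built above could be reduced along these lines. This is only a minimal sketch: mean_of is a hypothetical helper, the data is typed in by hand from the sample in the question rather than read from a file, and sum comes from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical data in the shape the loop above builds:
# each key maps to an array reference of captured values.
my %averages = (
    average1 => [ 3.2, 3.0 ],
    average2 => [ 5.5, 5.0 ],
);

# mean_of is a hypothetical helper, not part of the original reply.
# It returns undef for an empty list to avoid dividing by zero.
sub mean_of {
    my ($values) = @_;
    return undef unless @$values;
    return sum(@$values) / @$values;
}

for my $key ( sort keys %averages ) {
    printf "%s: %.2f over %d values\n",
        $key, mean_of( $averages{$key} ), scalar @{ $averages{$key} };
}
```

Keeping the raw values in arrays, rather than only a running sum and count, leaves room for other summaries later (minimum, maximum, median) at the cost of holding every number in memory.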