rayv has asked for the wisdom of the Perl Monks concerning the following question:


I am trying to calculate row averages (excluding
0's) for every row in a CSV file, skipping the first
column. The first column is an entity name and is
not used in the calculation of the average. My input CSV
file is as follows:

test.dat

AAA01,1.45,0.42,1.54,1.49,1.47,1.36,1.81,0.47,1.8,0.55
ABA05,1.29,1.09,1.13,1.88,1.11,1.44,1.25,1.23,1.05,1.39
BCD06,4.58,4.24,3.87,3.9,4.13,2.04,3.34,7.6,3.58,1.26,7.45
DFG12,26,26.34,24.59,26.46,26.24,26.14,32.35,31.77,31.77
MJK82,8.27,13.23,7.73,8.85,9.15,13.95,0,0,0,0,0
POU45,3.07,3.14,2.97,3.28,21.65,54.23,3.16,3.02,3.26
RTY76,1.22,1.3,1.11,0.92,1.57,1.06,1.01,0.87,0.93
SDH45,15.38,12,22.32,23.3,19.74,46.42,2.06,1.7,2.17
WPL02,13.24,13.23,7.73,8.85,9.15,13.95,0,0,0,0,0


I have been able to calculate a column average for all
of the columns, except the 1st column. The code I am
using to accomplish this is as follows:

open my $CPUFILE, "<", $file or die "Unable to open sorted file $!";
my @col_total;
my @col_count;
my @col_average;
while (<$CPUFILE>) {
    chomp;
    my @f = split /,/;
    shift @f;    # remove the 1st column
    my $num_cols = scalar @f;
    for my $i (0 .. $num_cols-1) {
        my $val = $f[$i];
        if ($val>0) {
            $col_count[$i]++;
            $col_total[$i] += $val;
        }
    }
}
for my $i (0 .. $#col_num) {
    $col_average[$i] = $col_total[$i]/$col_count[$i];
    print "Column ", ($i + 2), " Average = $col_average[$i]\n";
}
close $CPUFILE or die;
my $total = 0;
$total += $_ for @col_average;
print "Average CPU Time for Column for all servers in cpuatest.dat file is ";
print $total/@col_average, "\n";


I am rather new to array processing and perl. Can
someone please tell me how to read the values of the
row into an array and then iterate over the array to
calculate an average (excluding 0's) for each row?

Replies are listed 'Best First'.
Re: Calculating Row Averages From a CSV File
by Velaki (Chaplain) on Aug 15, 2007 at 11:52 UTC

    Here's a nifty twist on it.

    #!/bin/perl
    use strict;
    use warnings;

    my ( $name, $avg );

    while (<DATA>) {
        my @list;
        ( $name, @list ) = split /,/;
        my $sum = 0;
        $sum += $_ for @list;
        $avg = $sum / scalar @list;
        write;
    }

    format STDOUT=
    @<<<< @#.##
    $name,$avg
    .

    __DATA__
    AAA01,1.45,0.42,1.54,1.49,1.47,1.36,1.81,0.47,1.8,0.55
    ABA05,1.29,1.09,1.13,1.88,1.11,1.44,1.25,1.23,1.05,1.39
    BCD06,4.58,4.24,3.87,3.9,4.13,2.04,3.34,7.6,3.58,1.26,7.45
    DFG12,26,26.34,24.59,26.46,26.24,26.14,32.35,31.77,31.77
    MJK82,8.27,13.23,7.73,8.85,9.15,13.95,0,0,0,0,0
    POU45,3.07,3.14,2.97,3.28,21.65,54.23,3.16,3.02,3.26
    RTY76,1.22,1.3,1.11,0.92,1.57,1.06,1.01,0.87,0.93
    SDH45,15.38,12,22.32,23.3,19.74,46.42,2.06,1.7,2.17
    WPL02,13.24,13.23,7.73,8.85,9.15,13.95,0,0,0,0,0

    The output you get is:

    AAA01 1.24
    ABA05 1.29
    BCD06 4.18
    DFG12 27.96
    MJK82 5.56
    POU45 10.86
    RTY76 1.11
    SDH45 16.12
    WPL02 6.01

    Hope this helped,
    -v.

    "Perl. There is no substitute."
Re: Calculating Row Averages From a CSV File
by FunkyMonk (Bishop) on Aug 15, 2007 at 11:44 UTC
    I am rather new to array processing and perl
    Everything you need to calculate row averages is in the code you just posted. Why not ask for help in understanding the script you've already got?

    Your program has an error btw.

    for my $i (0 .. $#col_num) {

    @col_num isn't defined anywhere in the program. If you include

    use strict; use warnings;

    at the beginning of the program, perl will tell you about such errors.
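
    A minimal sketch of what that looks like, assuming the typo as posted (@col_num where @col_count was meant). Without strict, $#col_num quietly evaluates to -1, so the loop body never runs and no error is reported at that line. The helper strict_error is hypothetical; it only wraps a string eval so the compile-time error can be captured and printed instead of aborting the script.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet with strict enabled.
# Returns the error message, or '' if the snippet compiled and ran cleanly.
sub strict_error {
    my ($code) = @_;
    my $ok = eval "use strict; use warnings; $code; 1";
    return $ok ? '' : $@;
}

# The typo: @col_num was never declared, @col_count was meant.
my $typo  = 'my @col_count = (3, 3); my $last = $#col_num;';
my $fixed = 'my @col_count = (3, 3); my $last = $#col_count;';

print "typo:  ", strict_error($typo)  || "ok\n";
print "fixed: ", strict_error($fixed) || "ok\n";
```

    The first snippet should be rejected at compile time with a message naming @col_num; the second compiles cleanly.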


      FunkyMonk


      The reference to $#col_num was mistyped - the reference
      was actually $#col_count. I did use the strict and
      warnings pragmas, but as the reference was correct they
      did not come into play.


      I understand this code. What I don't understand is how
      to use the columns in each row to calculate an average
      for the row.

        Place this after shift @f.
        my $row_sum;
        @f = grep { $_ } @f;
        $row_sum += $_ for @f;
        print "Row average = ", $row_sum / @f, "\n";

        Given:

        AAA01,0,0,0,0,0,1
        AAA02,0,0,0,0,1,2

        produces:

        Row average = 1
        Row average = 1.5

        update: Fixed error pointed out here by rayv
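
        A row made up entirely of zeros leaves @f empty after the grep, and the division then dies with "Illegal division by zero". A minimal sketch of one way to guard against that, assuming every field after the name is numeric and that it is acceptable to report such a row as having no average. The sub name row_average is hypothetical; sum comes from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: average of the non-zero values in one CSV line,
# ignoring the first field (the entity name). Returns the name and the
# average; the average is undef when every value in the row is zero.
sub row_average {
    my ($line) = @_;
    chomp $line;
    my ( $name, @values ) = split /,/, $line;
    my @nonzero = grep { $_ != 0 } @values;
    my $avg = @nonzero ? sum(@nonzero) / @nonzero : undef;
    return ( $name, $avg );
}

for my $line ( "AAA01,0,0,0,0,0,1", "AAA02,0,0,0,0,1,2", "AAA03,0,0,0" ) {
    my ( $name, $avg ) = row_average($line);
    print "$name: ", defined $avg ? $avg : "no non-zero values", "\n";
}
```

        The first two rows give 1 and 1.5, as in the sample above; the third, all-zero row is reported instead of aborting the run.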

Re: Calculating Row Averages From a CSV File
by Prof Vince (Friar) on Aug 15, 2007 at 14:08 UTC
    Or a good ol' one-liner :

    perl -Mstrict -MList::Util=sum -wlne 'my @a = grep $_, split /,/; print shift @a, ":", sum(@a) / @a' test.dat
Re: Calculating Row Averages From a CSV File
by toolic (Bishop) on Aug 15, 2007 at 17:12 UTC