briglass has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I need a lot of help. I would like to open a text file that contains hundreds of lines of tab-delimited numerical values (hundreds of values per line), each between 0 and 255. I would like to take the mean of all these values (so I guess the program would add up all the values and then divide the sum by the number of values in the file). The result for each text file would then be a single value between 0 and 255.

I would like to perform this on a series of files and then create a single text file that holds all of these mean values, each on a new line:

i.e.,
230
198
194
6
201
57
115
...etc.

Also, this would need to be run with the perl included with Mac OS X.

Thanks to whoever is willing to help me take on this seemingly simple task (yet difficult for me in a language with which I am not familiar.)

Thanks again,
Brian

Re: Read in values and average -- Perl for MacOS X
by m-rau (Scribe) on Feb 14, 2005 at 21:42 UTC

    Hey,

    First, I did not test this code; second, I know this can be done in far fewer lines. Actually, I'm thinking about a one-liner.

    Nevertheless, you are a newbie, so I suggest this code since it is much more readable.

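    #!/usr/bin/perl
    use strict;
    use warnings;

    my %data;

    # opendir takes a directory name, not a glob pattern
    opendir( DIR, "input" ) or die "Cannot open input: $!";
    while( my $file = readdir( DIR )) {
        next unless -f "input/$file";    # skip ., .. and subdirectories
        open( FILE, "<", "input/$file" ) or die "Cannot open input/$file: $!";
        while( my $line = <FILE> ) {
            chomp( $line );
            # each line holds many tab-delimited values
            foreach my $value ( split ' ', $line ) {
                $data{ $file }->{ sum } += $value;
                $data{ $file }->{ count }++;
            }
        }
        close( FILE );
    }
    closedir( DIR );

    open( FILE, ">", "result.txt" ) or die "Cannot write result.txt: $!";
    foreach my $file (sort keys %data) {
        printf FILE "%1.2f\n", $data{ $file }->{ sum } / $data{ $file }->{ count };
    }
    close( FILE );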

      Thank you very much for your response!

      -Brian

Re: Read in values and average -- Perl for MacOS X
by revdiablo (Prior) on Feb 14, 2005 at 21:45 UTC

    For a simple approach, you will want to:

    • open the output file
    • loop through the list of input files
      • open them
      • loop through their contents
    • average out their values
    • print that to the output file
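    The steps above could be sketched roughly as follows. This is an untested-on-your-data sketch, not a definitive solution: the subroutine name mean_of_file, the output name means.txt, and the choice to round to a whole number are all assumptions made for illustration. It uses only core Perl, so it should not need anything beyond a stock perl installation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the mean of every whitespace-separated
# value in one file, or undef if the file holds no values.
sub mean_of_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    my ($total, $count) = (0, 0);
    while (my $line = <$in>) {
        for my $value (split ' ', $line) {   # tabs count as whitespace
            $total += $value;
            $count++;
        }
    }
    close $in;
    return $count ? $total / $count : undef;
}

# Assumed usage: perl average.pl file1.txt file2.txt ...
# The output name means.txt is an assumption, not from the thread.
if (@ARGV) {
    open my $out, '>', 'means.txt' or die "Cannot write means.txt: $!";
    for my $file (@ARGV) {
        my $mean = mean_of_file($file);
        next unless defined $mean;           # skip files with no values
        printf {$out} "%.0f\n", $mean;       # one whole-number mean per line
    }
    close $out or die "Cannot close means.txt: $!";
}
```

    Keeping the per-file arithmetic in its own subroutine means the count is reset for every file, and an empty file is skipped instead of causing a division by zero.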

    If you have more specific questions about the implementation, please ask them. We usually try to avoid giving out complete working solutions. [Of course, by the time I post this, someone might prove me wrong. But I'm willing to take that risk. :-)]

    Update: indeed, m-rau proved me wrong.

Re: Read in values and average -- Perl for MacOS X
by TedYoung (Deacon) on Feb 14, 2005 at 21:47 UTC

    Well, some examples of what you have tried and why it doesn't work would help us better understand what you need...

    Here is some code to get you started:

    # remember to use strict, warnings, etc
    # set $logfile to the name of your output file
    open LOG, ">$logfile" or die $!;

    # You will need to update this:
    # go through each .txt file in DIR
    for my $file (<DIR/*.txt>) {
        # open the file
        open F, $file or die $!;
        my ($total, $count) = (0, 0);
        while (<F>) {
            # each line holds many whitespace-separated values
            for my $value (split) {
                $count ++;          # the total number of samples
                $total += $value;   # the running accumulator
            }
        }
        close F;
        next unless $count;         # skip empty files (no division by zero)
        # remove the int if you want a decimal number.
        my $avg = int($total / $count);
        print LOG "$avg\n";
    }
    close LOG;

    Ted Young

    ($$<<$$=>$$<=>$$<=$$>>$$) always returns 1. :-)

Re: Read in values and average -- Perl for MacOS X
by Animator (Hermit) on Feb 14, 2005 at 21:54 UTC

    An unreadable one-liner: (I'll probably get downvoted for this :( )

    perl -ple 'BEGIN { $/ = qq(\t); $x=0; @l=(); $y=0;} $l[$x]+=$_, $y++ for (split /\n/); } continue { $l[$x] /= $y, $y = 0, $x++ if eof; } for (@l) { ' *.txt > file.list

    *.txt is the pattern matching the input files; file.list is the output file. (You might need to change the single quotes to double quotes.)

    If you are trying to figure this one out, take a look at `perldoc perlrun` and `perldoc -f eof`.

    Update: removed 'close ARGV,' since it had no effect. (In my first attempt I used $., which is why it was there.)
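    For anyone following the `perldoc -f eof` pointer, here is a longer, readable sketch of the same idea: read every file through the <> operator and use eof (without parentheses) to notice when the current file has ended. It is not a translation of the one-liner above, and the subroutine name means_per_file is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one mean per non-empty file in @files.
sub means_per_file {
    my @files = @_;
    local @ARGV = @files;          # <> reads the files listed in @ARGV
    my @means;
    my ($total, $count) = (0, 0);
    while (my $line = <>) {
        for my $value (split ' ', $line) {
            $total += $value;
            $count++;
        }
        if (eof) {                 # true at the end of each input file
            push @means, $total / $count if $count;
            ($total, $count) = (0, 0);   # start fresh for the next file
        }
    }
    return @means;
}

# Assumed usage: perl means.pl *.txt > file.list
if (@ARGV) {
    print "$_\n" for means_per_file(@ARGV);
}
```

    Note the difference between eof and eof(): with empty parentheses it only becomes true at the end of the very last file, so the per-file reset would never happen.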

Re: Read in values and average -- Perl for MacOS X
by sh1tn (Priest) on Feb 14, 2005 at 22:10 UTC

    Untested code:

    my @files;
    my (@l_count, $l_count);
    my (@f_count, $f_count);
    my @all;

    grep{ -f and push @files, $_ }glob '*';

    for( @files ){
        open FH, $_ or die $!;
        while( <FH> ){
            push @l_count, $_ for split '\s+', $_;
            for( @l_count ){
                $l_count += $_ for @l_count
            }
            @l_count = ();
            push @f_count, ( $l_count / $#l_count );
            $l_count = 0;
        }
        close FH;
        $f_count += $_ for @f_count;
        @f_count = ();
        push @all, ( $f_count / $#f_count );
        $f_count = 0;
    }

    open FH, '>end_res.log' or die $!;
    print FH $_, $/ for @all;
    close FH;

    I hope this more or less clear logic helps.

      sh1tn-

      Thanks for the code-- It seems to work perfectly, except for the fact that the resulting mean value seems to be (sum * count) instead of (sum / count). I tried fiddling with the code to get it to output the ratio instead of the product, but to no avail.

      For example, a file with the following tab-delimited values:

      1 1 1

      outputs:

      9

      ((1+1+1) * 3)

      Any ideas?

      Thanks,
      Brian

        Please excuse my lack of attention.

        my @files;
        my (@l_count, $l_count);
        my (@f_count, $f_count);
        my @all;
        my $res_file = 'end_res.log';

        # remove any old result file so it is not read as input
        -f $res_file and unlink $res_file;
        grep{ -f and push @files, $_ }glob '*';

        for( @files ){
            open FH, $_ or die $!;
            while( <FH> ){
                /(?:\d+\s*\d*)/ or next;    # skip lines without digits
                push @l_count, $_ for split '\s+', $_;
                $l_count += $_ for @l_count;
                # mean of this line: sum / number of values
                push @f_count, ( $l_count / ($#l_count + 1) );
                $l_count = 0;
                @l_count = ();
            }
            close FH;
            $f_count += $_ for @f_count;
            # mean of the per-line means for this file
            push @all, ( $f_count / ($#f_count + 1) );
            $f_count = 0;
            @f_count = ();
        }

        open FH, ">$res_file" or die $!;
        print FH $_, $/ for @all;
        close FH;