larryk has asked for the wisdom of the Perl Monks concerning the following question:

I figure there must be a module to do this. Anyone know what it is? I wrote my own in the meantime:
sub sum($) {
    my $listref = $_[0];
    my $sum = 0;
    map { $sum += $_ } @{$listref};
    return $sum;
}

sub avg($) {
    my $listref = $_[0];
    sum($listref)/@{$listref};
}

sub rms($) {
    my $listref = $_[0];
    sqrt(sum([map{(avg($listref)-$_)**2}@{$listref}])/@{$listref});
}
Pls comment.

"Argument is futile - you will be ignorralated!"

Re: How do I get the root mean square of a list?
by davorg (Chancellor) on Jun 04, 2001 at 19:24 UTC

    Been a while since I worked with RMS values, but from what I remember your code looks a little overcomplex. What's wrong with this:

    sub rms {
        my $sq;
        $sq += $_ * $_ foreach @_;
        sqrt($sq/@_);
    }
Re: How do I get the root mean square of a list?
by MeowChow (Vicar) on Jun 04, 2001 at 19:40 UTC
    The formula you implemented is for calculating standard deviation, not RMS.
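To make the distinction concrete, here is a minimal sketch of both quantities side by side. The sub names and the sample data are only illustrative, and both subs assume a non-empty list of numbers: RMS is the square root of the mean of the squared values, while the (population) standard deviation is the RMS of the deviations from the mean.

```perl
use strict;
use warnings;

# Root mean square: square root of the mean of the squared values.
# Assumes a non-empty list.
sub rms {
    my $sq = 0;
    $sq += $_ * $_ for @_;
    return sqrt($sq / @_);
}

# Population standard deviation: the RMS of the deviations from the mean.
# Assumes a non-empty list.
sub std_dev {
    my $sum = 0;
    $sum += $_ for @_;
    my $avg = $sum / @_;
    return rms(map { $_ - $avg } @_);
}

my @data = (3, 4, 5);    # illustrative data
printf "rms     = %.4f\n", rms(@data);
printf "std_dev = %.4f\n", std_dev(@data);
```

For (3, 4, 5) this prints rms = 4.0825 and std_dev = 0.8165. The two only agree when the mean of the list is zero.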
Re: How do I get the root mean square of a list?
by larryk (Friar) on Jun 04, 2001 at 20:02 UTC
    Thanks, guys. The horribly inefficient, incorrectly named "rms" sub now looks like:
    sub std_dev($) {
        my $listref = $_[0];
        my $avg = avg $listref;
        sqrt(sum([map{($avg-$_)*($avg-$_)}@{$listref}])/@{$listref});
    }
    Can I improve on this?
      Pass-by-reference isn't winning you anything here, since you are explicitly dereferencing and iterating over the entire array anyway. I would rewrite as:
      sub std_dev {
          my ($sum, $avg, $variance, $t);
          $sum += $_ for @_;
          $avg = $sum / @_;
          $variance += ($t = ($avg - $_)) * $t for @_;
          $variance /= @_;
          sqrt $variance;
      }
      This benchmarks over twice as fast, mostly because it's not calling outside subs for summation and averaging. There's a little intermediate result optimization in there as well.
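For anyone who wants to check the timing claim on their own machine, a comparison along these lines can be run with the core Benchmark module. This is only a sketch: the sub names, the list size, and the random test data are assumptions, and the ratio you see will depend on your perl and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Reference-passing version built on helper subs, in the style of the
# earlier post (names here are hypothetical).
sub sum_ref { my $s = 0; $s += $_ for @{ $_[0] }; return $s }
sub avg_ref { return sum_ref($_[0]) / @{ $_[0] } }
sub std_dev_ref {
    my $list = $_[0];
    my $avg  = avg_ref($list);
    return sqrt( sum_ref([ map { ($avg - $_) ** 2 } @$list ]) / @$list );
}

# Flat-list version with the summation and averaging inlined.
sub std_dev_flat {
    my ($sum, $variance) = (0, 0);
    $sum += $_ for @_;
    my $avg = $sum / @_;
    $variance += ($avg - $_) ** 2 for @_;
    return sqrt($variance / @_);
}

my @data = map { rand 100 } 1 .. 1000;    # hypothetical test data

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    by_ref => sub { std_dev_ref(\@data) },
    flat   => sub { std_dev_flat(@data) },
});
```

A negative count tells cmpthese to run each sub for at least that many CPU seconds rather than a fixed number of iterations, which tends to give steadier numbers for small subs like these.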

      I wonder, however, if there isn't a more efficient algorithm for calculating standard deviation, that doesn't require iterating twice... tilly?

         MeowChow                                   
                     s aamecha.s a..a\u$&owag.print
        You don't need to iterate twice. The following single-pass version should be slightly faster, although the sum-of-squares form can lose precision to cancellation when the mean is large relative to the spread:
        sub std_dev {
            my ($sum, $sqr_sum);
            for (@_) {
                $sum += $_;
                $sqr_sum += $_ * $_;
            }
            sqrt(($sqr_sum - $sum * $sum / @_)/@_);
        }
        And, of course, if your set of numbers is a sample drawn from a random distribution, the standard deviation of the numbers is a biased estimator of the true standard deviation, so you may prefer the following, which divides by n-1 instead of n:
        # Estimates a standard deviation from a sample, using the
        # unbiased (n - 1) estimate of the variance. $#_ is n - 1.
        sub std_dev_samp {
            my ($sum, $sqr_sum);
            for (@_) {
                $sum += $_;
                $sqr_sum += $_ * $_;
            }
            sqrt(($sqr_sum - $sum * $sum / @_)/$#_);
        }
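As an aside on single-pass calculation: when precision matters, a commonly used alternative to accumulating raw sums of squares is Welford's update, which tracks a running mean and a running sum of squared deviations. The sketch below swaps in that technique; it is not from the posts above, the sub name is made up, and it assumes a non-empty list of numbers.

```perl
use strict;
use warnings;

# Welford's single-pass update: keeps a running mean and a running sum of
# squared deviations ($m2) instead of raw sums of squares.
# Assumes a non-empty list.
sub std_dev_welford {
    my ($n, $mean, $m2) = (0, 0, 0);
    for my $x (@_) {
        $n++;
        my $delta = $x - $mean;           # deviation from the old mean
        $mean += $delta / $n;             # update the running mean
        $m2   += $delta * ($x - $mean);   # uses both old and new mean
    }
    return sqrt($m2 / $n);    # population form; divide by ($n - 1) for a sample
}

# Illustrative data with a large offset relative to its spread.
my @data = map { 1e9 + $_ } (4, 7, 13, 16);
printf "std_dev = %.6f\n", std_dev_welford(@data);
```

Because the accumulator holds deviations rather than large squared values, a big constant offset in the data does not swamp the result the way it can with the sum-of-squares form.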
        You have heard of PDL, haven't you? It handles array operations at close to C speed from within Perl, and its stats() function includes rms among the values it returns.