I am a fool, and strictly speaking this problem is not about Perl but about my lack of statistical knowledge. I apologise for that... I present the following simplest of problems:
vmstat 1 10 extract:

po fr
0 0
0 0
150 10
0 0
0 0

I wish to calculate the ratio po:fr for this. If it exceeds 15:1, make some printf noise. My current solution is:
my $tlsamples = @series_po;          # = @series_fr
return 0 if ($tlsamples == 0);
my $sum_po = sum(\@series_po);       # = 150
my $sum_fr = sum(\@series_fr);       # = 10
$sum_fr = 1 if ($sum_fr == 0);
my $avg_po = $sum_po / $tlsamples;   # = 150 / 5 = 30
my $avg_fr = $sum_fr / $tlsamples;   # = 10 / 5 = 2
$avg_fr = 1 if ($avg_fr == 0);       # avoid div/0
my $pofr = $avg_po / $avg_fr;        # = 15
This result of 15:1 is the same as for the following series:
po fr
150 10
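The two results agree because the sample count cancels: (sum_po / n) / (sum_fr / n) = sum_po / sum_fr, so rows of zeroes drop out of a ratio of averages. A minimal sketch of that, where `ratio_of_averages` is a hypothetical helper name and `sum` from List::Util is an assumption standing in for the `sum` used above:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: ratio of the average po to the average fr.
# Takes two array refs of equal length.
sub ratio_of_averages {
    my ($po, $fr) = @_;
    my $n = @$po;
    return 0 if $n == 0;
    my $avg_po = sum(@$po) / $n;
    my $avg_fr = sum(@$fr) / $n;
    $avg_fr = 1 if $avg_fr == 0;    # avoid div/0, as in the code above
    return $avg_po / $avg_fr;
}

# Five samples with a single spike, versus the spike on its own.
my $with_zeroes = ratio_of_averages([0, 0, 150, 0, 0], [0, 0, 10, 0, 0]);
my $spike_only  = ratio_of_averages([150], [10]);

printf "with zeroes: %.1f, spike only: %.1f\n", $with_zeroes, $spike_only;
```

Both calls come out at 15, which is why the zero samples carry no weight in this calculation.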
The problem is, I need the zeroes to be significant in the first series, since they are. A single value spike should not be able to cause an alert, given many other zero values! (where 0 = no activity in vmstat context)
I have zero (pun intended) statistical background. I have thought of replacing each zero value with its nearest least-significant alternative, e.g.
po fr
150 10
1 1
1 1
1 1
1 1
In this case the ratio works out to (154/5) / (14/5) = 11. Is there a statistically sound, Perl-friendly approach that gives significance to the zeroes in the series?
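For reference, the zero-substitution idea can be sketched as follows. This is only an illustration of the arithmetic above, not a claim that it is statistically sound; `floored_ratio` is a hypothetical helper name, and using `sum` from List::Util is an assumption:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: replace each zero sample with a floor value
# (1 here, as in the example above), then take the ratio of averages.
# Assumes $floor > 0, so the fr average can never be zero.
sub floored_ratio {
    my ($po, $fr, $floor) = @_;
    my @po = map { $_ == 0 ? $floor : $_ } @$po;
    my @fr = map { $_ == 0 ? $floor : $_ } @$fr;
    my $n  = @po;
    return 0 if $n == 0;
    return (sum(@po) / $n) / (sum(@fr) / $n);
}

my $pofr = floored_ratio([0, 0, 150, 0, 0], [0, 0, 10, 0, 0], 1);
printf "po:fr = %.0f:1\n", $pofr;
printf "ALERT: po:fr exceeds 15:1\n" if $pofr > 15;
```

With the five-sample series this gives (154/5) / (14/5) = 11, so no alert is printed for the single spike.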
Niel
In reply to A lesson in statistics by 0xbeef