Please supply a rationale for better_average.
Update: I thought the code below would speak for itself, but here is my rationale for posting it, and for questioning the better_average routine submitted above.
- The only difference between bad_average and better_average is that a division by a constant is moved inside the innermost loop.
- If this were handed to any of us as a homework problem in refactoring, would we not pull the division back out again?
- But numerical methods are for many of us an intimidating twilight zone in which paradoxical results hold true.
- Let us require at least one example in support of every numerical claim.
- I claim that better_average is worse than bad_average, and here follows my supporting example.
use strict;
use warnings;

sub bad_average {
    my $acc = 0;
    $acc += $_ for @_;
    return $acc / @_;
}

sub better_average {
    my $acc = 0;
    $acc += $_ / @_ for @_;
    return $acc;
}
my @inputs = (1) x 65537;   # 2**16 + 1
my $true_average = 1;

for my $adj (qw[bad better]) {
    my $func = "${adj}_average";
    no strict 'refs';
    my $average = &$func(@inputs);
    my $diff = $average - $true_average;
    print
        "Error using $func is ",
        $diff ? "a nonzero number near $diff" : 0,
        "\n";
}
Your roundoff error may vary, but on my system I get
Error using bad_average is 0
Error using better_average is a nonzero number near 3.33066907387547e-015
That's a fairly contrived example. better_average loses here because of compounded errors in floating-point division. I tried using random numbers instead, and I found the two more or less equivalent.
(As an aside, it occurs to me that I could have written "( $cmp > 0 ) ? 'bad' : ( $cmp < 0 ) ? 'better' : 'tie'" as "('tie', 'bad', 'better')[$cmp]", since $cmp is always -1, 0, or 1. That seems pretty obscure, though.)
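For what it's worth, the slice version works because Perl wraps a negative subscript around to the end of the list. A quick sketch of the idea (the loop values just simulate what <=> would return):

```perl
use strict;
use warnings;

# A list slice indexed by the result of <=> (-1, 0, or 1).
# Negative subscripts count from the end, so -1 picks 'better'.
for my $cmp (-1, 0, 1) {
    my $verdict = ('tie', 'bad', 'better')[$cmp];
    print "$cmp => $verdict\n";
}
```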
I've fooled with the various knobs a bit, and what seems to be true is that "bad" and "better" win about evenly. I thought maybe one would be better for large numbers than the other, but I haven't found that either.
I should note that this is my first use of Math::BigFloat, so it's entirely possible my entire test is bogus. Still, I think this is a better test of each algorithm than pumping them full of ones.
That's a fairly contrived example.
Yes, indeed. It was carefully contrived in a fair manner to demonstrate the problem with the "better" code. (:
The better_average loses here because of compounded errors in float division.
And it appears that it demonstrated the problem quite successfully to you. Yes, you have described exactly why the "better" code is actually worse. Doing the division once eliminates the possibility of accumulating errors from doing the division over and over again (and is more efficient).
I tried using random numbers instead, and I found them more or less equivalent.
Yes, floating point is carefully designed so that in most cases it mostly "just works". It is usually required that one carefully contrive the example to demonstrate the weakness in a floating point calculation. And you need to pay attention to how far off things can get, not how often near-single-bit errors happen to benefit one method vs the other.
The first sentence of ikegami's is true: it can be good to avoid adding numbers of much different magnitudes. (It is even better to avoid getting a small-magnitude result by subtracting two much larger numbers.) But his example doesn't appear to have anything to do with that advice.
A better example of that is adding up a huge list of numbers. You can get a more accurate total if you add up the small-magnitude numbers before adding in the big-magnitude numbers. Does that mean you should sort before adding up any list of numbers? Probably not. Most of the time it doesn't make much difference. But if you know you are going to need to sum up a few really large numbers and a ton of small numbers, then you certainly might want to save the big numbers for last.
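To make that concrete, here is a small sketch (the 1e16 and the thousand 1s are values I picked for illustration: adjacent doubles near 1e16 are 2 apart, so adding a 1 directly to the big running total is a no-op):

```perl
use strict;
use warnings;

my @nums = (1e16, (1) x 1000);   # one huge number, a thousand small ones

# Big number first: each subsequent +1 falls below the rounding
# granularity at 1e16 and is lost entirely.
my $big_first = 0;
$big_first += $_ for @nums;

# Small numbers first: the 1s accumulate to 1000 exactly, and
# 1e16 + 1000 is representable, so nothing is lost.
my $small_first = 0;
$small_first += $_ for sort { abs($a) <=> abs($b) } @nums;

printf "big first:   %.17g\n", $big_first;    # 1e16 exactly
printf "small first: %.17g\n", $small_first;  # 1e16 + 1000
```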
I do some similar tricks in Math::BigApprox. I also do some tricks to deal with floating-point numbers that are nearly equal (the FAQest "gotcha" with floating-point calculations).
Sorry for the delay in responding. Like I said in my message to you, I wanted some time to write a proper reply. It's been a while since I've dealt with floats. Before I address your question, I want to address your snippet.
I claim that better_average is worse than bad_average, and here follows my supporting example.
You haven't shown that at all. Considering the difference in error between the two functions is 0.000000000000003 of the input, they are equally good or equally bad for that data set.
Even if it did show bad_average to be better, your data sample is so horrible that no conclusion can be drawn from the comparison. In a discussion about minimizing the errors of dealing with floating point numbers, you used integers.
Finally, you also haven't shown better_average is bad. Quite the opposite, an error of 0.000000000000003 of the input is rather good.
If this were handed to any of us as a homework problem in refactoring, would we not pull the division back out again?
When it comes to numerical methods, the order in which operations are done can affect the results. For example, a*(b+c) equals a*b+a*c in the mathematical world, but that's not necessarily the case in the practical world. You'd be wrong to refactor such code without considering the real-world implications.
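A small sketch of that, using values where the two orderings demonstrably disagree (10, 0.1, and 0.2 are my choice for illustration, not from the thread):

```perl
use strict;
use warnings;

my ($x, $y, $z) = (10, 0.1, 0.2);

# 0.1 + 0.2 rounds to 0.30000000000000004..., and multiplying by 10
# preserves that error.
my $factored = $x * ($y + $z);

# Here both products happen to round to exactly 1 and 2, so the sum
# is exactly 3.
my $distributed = $x * $y + $x * $z;

printf "x*(y+z)   = %.17g\n", $factored;
printf "x*y + x*z = %.17g\n", $distributed;
print $factored == $distributed ? "equal\n" : "not equal\n";   # not equal
```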
Please supply a rationale for better_average.
Because moving the division changes the magnitude of all the numbers being added by the same amount, better_average is no better than bad_average. It was a bad example. A better average might sort the numbers so that the small ones are processed first. (Beware of the signs...)
There's been some speculation as to what I was thinking. What follows explains it.
I said "better_average is no better than bad_average", but that would be more precise if you added "for floating point numbers". For fixed point numbers, bad_average would fail miserably.
I've been working with fixed point numbers because the target machine had a very limited amount of memory. Using fixed point saved memory since the exponent wasn't stored along with the number.
Adding lots of numbers together would result in an overflow when dealing with fixed point numbers. For example, consider a register that can hold 5 decimal digits. (Roughly 15 bits, but easier to visualize.)
  40123 (/1000)
+ 50456 (/1000)
+ 20789 (/1000)
---------------
    overflow
You could convert them to a different scaling first, but that would result in a loss of precision.
40123 (/1000) =>   04012 (/100)
50456 (/1000) => + 05046 (/100)
20789 (/1000) => + 02079 (/100)
                 --------------
                   11137 (/100)   (111.370 instead of 111.368)
And if you were adding 1000 numbers,
40123 (/1000) =>   00040 (/1)
50456 (/1000) => + 00050 (/1)
20789 (/1000) => + 00021 (/1)
  ...                ...
                 -----------
                   00111 (/1)   (111.000 instead of 111.368)
By dividing first, you avoid the need to scale the numbers and therefore avoid the loss of precision.
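Here's one way to sketch that trade-off in Perl, simulating the 5-decimal-digit register above with plain integers (the register width and the truncating division are assumptions of this toy model, not anything from the original target machine):

```perl
use strict;
use warnings;

# Toy model of a 5-decimal-digit register; values carry an implied /1000 scale.
my $MAX  = 99_999;
my @vals = (40_123, 50_456, 20_789);   # 40.123, 50.456, 20.789

# Summing first blows past the register's capacity.
my ($sum, $overflow) = (0, 0);
for my $v (@vals) {
    $sum += $v;
    if ($sum > $MAX) { $overflow = 1; last }
}
print $overflow ? "sum-first: overflow\n" : "sum-first: $sum\n";

# Dividing first keeps every intermediate value small enough to fit.
my $acc = 0;
$acc += int($_ / @vals) for @vals;     # truncating division, as fixed point would
printf "divide-first: %d (/1000) = %.3f\n", $acc, $acc / 1000;
```

The divide-first total here is 37121 (i.e. 37.121), slightly off the true average of 37.1226..., but at least it fits in the register.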
So to answer your question, I mistakenly mixed fixed point and floating point techniques.