in reply to Re^3: Floating Point Errors: How and Why to Avoid Adding and Subtracting Varying Numbers
in thread Floating Point Errors

That's a fairly contrived example.

Yes, indeed. It was carefully contrived in a fair manner to demonstrate the problem with the "better" code. (:

The better_average loses here because of compounded errors in float division.

And it appears that it demonstrated the problem quite successfully to you. Yes, you have described exactly why the "better" code is actually worse. Doing the division once eliminates the possibility of accumulating errors from doing the division over and over again (and is more efficient).

I tried using random numbers instead, and I found them more or less equivalent.

Yes, floating point is carefully designed so that in most cases it mostly "just works". It is usually required that one carefully contrive the example to demonstrate the weakness in a floating point calculation. And you need to pay attention to how far off things can get, not how often near-single-bit errors happen to benefit one method vs the other.

The first sentence of ikegami's is true: It can be good to avoid adding numbers of much different magnitudes. (It is even better to avoid getting a small-magnitude result by subtracting two much larger numbers.) But his example doesn't appear to have anything to do with that advice.

A better example of that is adding up a huge list of numbers. You can get a more accurate total if you add up the small-magnitude numbers before adding in the big-magnitude numbers. Does that mean you should sort before adding up any list of numbers? Probably not. Most of the time it doesn't make much difference. But if you know you are going to need to sum up a few really large numbers and a ton of small numbers, then you certainly might want to save the big numbers for last.
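A minimal sketch of that effect, using made-up numbers chosen only so the difference is easy to see (not data from any real application):

    use strict;
    use warnings;

    my $big_value   = 1e16;     # made-up values, chosen so the effect is visible
    my $small_value = 1;
    my $small_count = 100_000;

    # Big number first: near 1e16 the gap between adjacent doubles is 2,
    # so each later addition of 1 is rounded away and the small values
    # are silently lost.
    my $sum_big_first = $big_value;
    $sum_big_first += $small_value for 1 .. $small_count;

    # Small numbers first: they accumulate exactly to 100000 before the
    # big number is added, and that total survives the final addition.
    my $sum_small_first = 0;
    $sum_small_first += $small_value for 1 .. $small_count;
    $sum_small_first += $big_value;

    printf "big first:   %.1f\n", $sum_big_first;    # 10000000000000000.0
    printf "small first: %.1f\n", $sum_small_first;  # 10000000000100000.0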

I do some similar tricks in Math::BigApprox. I also do some tricks to deal with floating-point numbers that are nearly equal (the FAQest "gotcha" with floating-point calculations).

- tye        


Re^5: Floating Point Errors: How and Why to Avoid Adding and Subtracting Varying Numbers (contriving)
by kyle (Abbot) on May 29, 2008 at 03:10 UTC
    And it appears that it demonstrated the problem quite successfully to you.

    Indeed. Enlightenment achieved.

    I started my reply thinking that I'd find that in a more common (i.e., not contrived) case, ikegami's better_average really would be better. What I found instead is that in the more common case they're pretty much the same. Even though I did not show what I set out thinking I'd show, I posted my results anyway.

    Someone else who had the same thoughts I had can examine my results and my methods and go forward from there or at least not have to cover the same ground again.

    You seem to be saying that contrived examples are more important than a common case. That is, the very limit of inaccuracy is where to make a decision. I think for any given method, the more common the case is, the more important it is. For these two, with what's shown so far, bad_average looks better. If there were some less common but still plausible scenario where better_average wins, I'd want to go with that, even though it fails in a case I'll probably never see.

    I appreciate your thoughts on handling floats. As I said elsewhere, my own attitude toward them is mostly just fear and loathing left over since I've forgotten the details.

      You seem to be saying that contrived examples are more important than a common case.

      No, that isn't really my point. Your random samples fail to be important on one level not because they are random (nor "common"), but because they don't show much potential for error. You had to resort to Math::BigFloat just to be able to detect the error at all.

      They are a bit useful in showing that floating point is pretty resilient.

      But the fact that a bunch of random cases don't demonstrate a problem isn't much reassurance to me that there is no problem. There are famous algorithms that work great on random data but fall apart if the initial input is sorted, for example. Countering a demonstration of this fact by claiming that sorting the data is "contrived" and then generating a bunch of random data and not finding a problem probably would leave many unconvinced.

      I'm quite unconvinced that leaving the potential to accumulate errors via repeated divisions isn't a real problem. Having a lot of repeated values in a data set is not something I consider "highly unlikely", and it is a perfect situation for the repeated divisions to be "off" in the same direction over and over and thus for much more inaccuracy to accumulate when the "better" code is used. And there are a lot of other ways in which data can be structured such that most or even all of the error terms in each little division computation end up having the same sign.

      Back to ikegami's motivation. If you have a large set of numbers that mostly all fall within some reasonably small range of values (not near zero), then computing the average by first adding up all of the numbers can lead you to a step where you are adding one small number (the last number to be summed) to a much bigger number (the sum of all of the other numbers). Such an operation will end up ignoring some of the least-significant digits of the value of this last number, thus providing the potential for a less accurate answer than might be had via some other scheme.
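      A minimal demonstration of that absorption effect (the numbers here are picked only to make it obvious):

          use strict;
          use warnings;

          # Near 2**53 the gap between adjacent doubles is 2, so adding a
          # final term of 1 to a sum that has already grown that large
          # changes nothing: its low-order digits are simply dropped.
          my $running_sum = 2**53;
          my $last_term   = 1;

          printf "%.1f\n", $running_sum + $last_term;                     # 9007199254740992.0
          printf "%.1f\n", ( $running_sum + $last_term ) - $running_sum;  # 0.0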

      For a short while, I thought one could avoid this by computing the average at each step, which made me think that perhaps this is what ikegami had meant to demonstrate. But that doesn't actually work (you'd still end up adding a small number to a big number toward the end of your calculations).
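      For concreteness, here is one way "computing the average at each step" might look: the standard running-mean update. This is my own sketch of that idea, not ikegami's code, and it shows why the approach doesn't escape the problem.

          use strict;
          use warnings;

          # Running-mean update: avg <- avg + (x - avg) / n.
          # This is only a guess at what "computing the average at each step"
          # might look like; it is not taken from ikegami's post.
          sub running_average {
              my @values = @_;
              my ( $avg, $n ) = ( 0, 0 );
              for my $x (@values) {
                  $n++;
                  # Once $n is large, ($x - $avg) / $n is tiny compared to
                  # $avg, so this step is again adding a small number to a
                  # big one -- the same rounding problem, just relocated.
                  $avg += ( $x - $avg ) / $n;
              }
              return $avg;
          }

          print running_average( 1 .. 10 ), "\n";    # 5.5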

      Another strategy that might have merit is to compute the sum as if doing it over a huge binary tree. First, go over the data summing adjacent pairs of items so you are left with about 1/2 as many numbers, each of which is, on average, twice as large. Repeat. Keep repeating until you just have two numbers, each representing the sum of roughly half of the original numbers. Add those two together, divide by how many samples you started with, and you've got the average of a large set of numbers without ever having to add a very small number to a very large number.
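      A rough sketch of that tree-style summation (my own illustration of the idea just described, not code from this thread):

          use strict;
          use warnings;

          # Sum a list by repeatedly adding adjacent pairs, so that each
          # addition combines two partial sums of roughly equal magnitude.
          sub pairwise_sum {
              my @terms = @_;
              return 0 unless @terms;
              while ( @terms > 1 ) {
                  my @next;
                  push @next, shift(@terms) + shift(@terms) while @terms >= 2;
                  push @next, @terms;    # carry along an odd leftover element
                  @terms = @next;
              }
              return $terms[0];
          }

          sub pairwise_average {
              my @values = @_;
              return pairwise_sum(@values) / @values;   # @values in scalar context is the count
          }

          print pairwise_average( 1 .. 8 ), "\n";    # 4.5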

      That might be a "better" way. Given a non-contrived set of numbers, I doubt you would be able to tell. But contrive the right set of numbers and you can probably demonstrate how much such a convoluted way of summing can improve accuracy. I don't have such a contrived example, which is part of why I'm unconvinced. I certainly suspect that such is possible, which would mean that such examples can be contrived.

      If there were some less common but still plausible scenario where better_average wins, I'd want to go with that, even though it fails in a case I'll probably never see.

      Imaginary examples are worse than contrived examples.

      Thanks.

      - tye