in reply to Funny addition behaviour

Several times below I abuse the word "digit" to refer to a single character in any number system, not just a base-ten one. The meaning should be clear from context.

To see what's really going on at a bit-wise level, you can look at this "one-liner":

perl -le 'print sprintf("%6.1f: ", $_), reverse(split //, unpack("b*", pack("F", $_))) for map {eval} @ARGV'

and give it the arguments:

33.3 33.4 '33.3+33.3' '33.4+33.3' '33.3+33.3+33.4' '33.4+33.3+33.3'

The unpack/pack combination gives you the floating point numbers as perl sees them ("F" means "what perl uses for floating point" - on my machine, this is a double, so I could have gotten the same results with "d"), and the reverse is just there because intel is a little-endian architecture, and (modern) human beings are used to reading numbers with the most significant digit on the left.

You get:

33.3:0100000001000000101001100110011001100110011001100110011001100110 33.4:0100000001000000101100110011001100110011001100110011001100110011 66.6:0100000001010000101001100110011001100110011001100110011001100110 66.7:0100000001010000101011001100110011001100110011001100110011001100 100.0:0100000001011001000000000000000000000000000000000000000000000000 100.0:0100000001011000111111111111111111111111111111111111111111111111

(some spaces manually trimmed to fit all on one line in perlmonks.org's code display)

Now, what this means is not too hard to decode; here's a rough translation: (numbers prefixed with a 'B' mean binary)

33.3 is represented as 2^5 times B1.0000101001100110011001100110011001100110011001100110 33.4 is represented as 2^5 times B1.0000101100110011001100110011001100110011001100110011 66.6 is represented as 2^6 times B1.0000101001100110011001100110011001100110011001100110 66.7 is represented as 2^6 times B1.0000101011001100110011001100110011001100110011001100 100.0 (exactly) is represented as 2^6 times B1.1001000000000000000000000000000000000000000000000000 100.0 (not quite) is represented as 2^6 times B1.1000111111111111111111111111111111111111111111111111

So now let's do the math step-by-step: ('B' omitted; you should now by now what stuff's binary and what stuff isn't)

33.3 + 33.3 is then 2^5 times: 1.0000101001100110011001100110011001100110011001100110 + 1.0000101001100110011001100110011001100110011001100110 --------------------------------------------------------- 10.0001010011001100110011001100110011001100110011001100 But that's too long for a double, so it's represented as 2^6 times 1.0000101001100110011001100110011001100110011001100110 33.4 + 33.3 is then 2^5 times: 1.0000101100110011001100110011001100110011001100110011 + 1.0000101001100110011001100110011001100110011001100110 --------------------------------------------------------- 10.0001010110011001100110011001100110011001100110011001 Now, smashing that into a double results in 2^6 times: 1.0000101011001100110011001100110011001100110011001100
So now we need to get tricky. The last two additions involve one addend that's represented as 1.something times 2^6, and another addend that's represented as 1.something times 2^5. Before adding them, the addend that's represented as 1.something times 2^5 has to be rewritten so that it's 0.1something times 2^6. Then the addition can proceed normally. Note that this rewriting may involve some rounding of the last few binary digits.
66.6 + 33.4 is now 2^6 times: 1.0000101001100110011001100110011001100110011001100110 + 0.1000010110011001100110011001100110011001100110011010 --------------------------------------------------------- 1.1001000000000000000000000000000000000000000000000000 66.7 + 33.3 is now 2^6 times: 1.0000101011001100110011001100110011001100110011001100 + 0.1000010100110011001100110011001100110011001100110011 --------------------------------------------------------- 1.1000111111111111111111111111111111111111111111111111

Now, if you're sharp-eyed, you'll notice that when we rounded the result of 33.4 + 33.3 to fit into a double, we rounded down (the last few digits went from "11001" to "1100"). However, when we were rewriting 33.4 so that it was something times 2^6, we rounded upwards (the last few digits went from "10011" to "1010").

This is a demonstration of the fact that my hardware uses a round-to-even strategy when rounding floating point numbers - if the digit being chopped off is a "1", it rounds to the nearest number that makes the rounded result end in "0". (when humans use this strategy on decimal numbers, it affects what they do when they're rounding off a "5" - in grade school, we were taught to always round up with a "5", but many places use the rule "when rounding off a 5, round either up or down so that the new last digit is even")

In most cases, round-to-even means that round-off errors tend to cancel each other out, so it really is generally the most sensible strategy. However, people who are heavily into numerical analysis will want the ability to tweak this behavior for certain applications, and in fact there are functions defined in the standard math.h C header to do just this.

Anyway, the error in general is that floating point math on a computer is done in binary, not in decimal, and some things which are exactly representable in decimal aren't exactly representable in binary, so roundoff errors can creep in where one might not expect them. (but not the other way around - anything representable exactly in binary is representable exactly in decimal, which makes me think that perhaps some computer should be invented that internally handles floating point stuff in base 210, since that way you get everything decimal can handle, and also 1/3 and 1/7. 1/11 still gives you trouble, but to handle that too you'd need more than one byte per digit, which might make things harder...)

Replies are listed 'Best First'.
Re: Re: Funny addition behaviour
by fizbin (Chaplain) on Apr 23, 2004 at 03:17 UTC
    Note that if the binary makes you dizzy and you much prefer a hex display of the internals, you can get a reasonable hex output by changing the top one-liner to:
    perl -le 'print sprintf("%6.1f: ", $_), reverse(split /(..)/, unpack("H*", pack("F", $_))) for map {eval} @ARGV'
    Assuming your perl uses the same size floating point numbers as mine, the first three hex digits encode the sign and exponent, and the last 13 hex digits encode the part of the mantissa after "1."