Re: Funny addition behaviour

Several times below I abuse the word "digit" to refer to a single character in any number system, not just a base-ten one. The meaning should be clear from context.

To see what's really going on at a bit-wise level, you can look at this "one-liner":

perl -le 'print sprintf("%6.1f: ", $_),
                reverse(split //, 
                  unpack("b*", pack("F", $_))) 
          for map {eval} @ARGV'
[download]

and give it the arguments:

33.3 33.4 '33.3+33.3' '33.4+33.3' '33.3+33.3+33.4' '33.4+33.3+33.3'
[download]

The unpack/pack combination gives you the floating point numbers as perl sees them ("F" means "what perl uses for floating point" - on my machine, this is a double, so I could have gotten the same results with "d"), and the reverse is just there because intel is a little-endian architecture, and (modern) human beings are used to reading numbers with the most significant digit on the left.

You get:

 33.3:0100000001000000101001100110011001100110011001100110011001100110
 33.4:0100000001000000101100110011001100110011001100110011001100110011
 66.6:0100000001010000101001100110011001100110011001100110011001100110
 66.7:0100000001010000101011001100110011001100110011001100110011001100
100.0:0100000001011001000000000000000000000000000000000000000000000000
100.0:0100000001011000111111111111111111111111111111111111111111111111
[download]

(some spaces manually trimmed to fit all on one line in perlmonks.org's code display)

Now, what this means is not too hard to decode; here's a rough translation: (numbers prefixed with a 'B' mean binary)

33.3 is represented as 2^5 times
  B1.0000101001100110011001100110011001100110011001100110
33.4 is represented as 2^5 times
  B1.0000101100110011001100110011001100110011001100110011
66.6 is represented as 2^6 times
  B1.0000101001100110011001100110011001100110011001100110
66.7 is represented as 2^6 times
  B1.0000101011001100110011001100110011001100110011001100
100.0 (exactly) is represented as 2^6 times
  B1.1001000000000000000000000000000000000000000000000000
100.0 (not quite) is represented as 2^6 times
  B1.1000111111111111111111111111111111111111111111111111
[download]

So now let's do the math step-by-step: ('B' omitted; you should now by now what stuff's binary and what stuff isn't)

33.3 + 33.3 is then 2^5 times:
   1.0000101001100110011001100110011001100110011001100110
+  1.0000101001100110011001100110011001100110011001100110
---------------------------------------------------------
  10.0001010011001100110011001100110011001100110011001100

But that's too long for a double, so it's represented
as 2^6 times
   1.0000101001100110011001100110011001100110011001100110


33.4 + 33.3 is then 2^5 times:
   1.0000101100110011001100110011001100110011001100110011
+  1.0000101001100110011001100110011001100110011001100110
---------------------------------------------------------
  10.0001010110011001100110011001100110011001100110011001

Now, smashing that into a double results in 2^6 times:
   1.0000101011001100110011001100110011001100110011001100
[download]

So now we need to get tricky. The last two additions involve one addend that's represented as 1.something times 2^6, and another addend that's represented as 1.something times 2^5. Before adding them, the addend that's represented as 1.something times 2^5 has to be rewritten so that it's 0.1something times 2^6. Then the addition can proceed normally. Note that this rewriting may involve some rounding of the last few binary digits.

66.6 + 33.4 is now 2^6 times:
   1.0000101001100110011001100110011001100110011001100110
+  0.1000010110011001100110011001100110011001100110011010
---------------------------------------------------------
   1.1001000000000000000000000000000000000000000000000000

66.7 + 33.3 is now 2^6 times:
   1.0000101011001100110011001100110011001100110011001100
+  0.1000010100110011001100110011001100110011001100110011
---------------------------------------------------------
   1.1000111111111111111111111111111111111111111111111111
[download]

Now, if you're sharp-eyed, you'll notice that when we rounded the result of 33.4 + 33.3 to fit into a double, we rounded down (the last few digits went from "11001" to "1100"). However, when we were rewriting 33.4 so that it was something times 2^6, we rounded upwards (the last few digits went from "10011" to "1010").

This is a demonstration of the fact that my hardware uses a round-to-even strategy when rounding floating point numbers - if the digit being chopped off is a "1", it rounds to the nearest number that makes the rounded result end in "0". (when humans use this strategy on decimal numbers, it affects what they do when they're rounding off a "5" - in grade school, we were taught to always round up with a "5", but many places use the rule "when rounding off a 5, round either up or down so that the new last digit is even")

In most cases, round-to-even means that round-off errors tend to cancel each other out, so it really is generally the most sensible strategy. However, people who are heavily into numerical analysis will want the ability to tweak this behavior for certain applications, and in fact there are functions defined in the standard math.h C header to do just this.

Anyway, the error in general is that floating point math on a computer is done in binary, not in decimal, and some things which are exactly representable in decimal aren't exactly representable in binary, so roundoff errors can creep in where one might not expect them. (but not the other way around - anything representable exactly in binary is representable exactly in decimal, which makes me think that perhaps some computer should be invented that internally handles floating point stuff in base 210, since that way you get everything decimal can handle, and also 1/3 and 1/7. 1/11 still gives you trouble, but to handle that too you'd need more than one byte per digit, which might make things harder...)

Comment on Re: Funny addition behaviour Select or Download Code