Fixed Point Numbers

You can read the following more easily if you go follow this link
I don't know of any sources, but here's some background.

Let's use smaller numbers to demonstrate. Let's say we have 4 bit numb
+ers.

   +---+---+---+---+
   |   |   |   |   |
   +---+---+---+---+

As a little refresher, let's forget decimal numbers for a second.

   +---+---+---+---+
   | 0 | 1 | 1 | 1 |  7 = 0*2^3 + 1*2^2 + 1*2^1 + 0*2^0
   +---+---+---+---+

I'm going to call that B4, since 4 bits are used by the integer portio
+n of the number.

   +---+---+---+---+
   | 0 | 1 | 1 | 1 |  7 B4
   +---+---+---+---+

So how would you store a decimal number? Let's take 1.75, for example.

   +---+---+---+---+   +   +
   | 0 | 0 | 0 | 1 | 1   1    1.75 = 1*2^0 + 1*2^(-1) + 1*2^(-2)
   +---+---+---+---+   +   +

Unfortunately, I can't just add bits to my register.
What if we shifted the bits?

   +---+---+---+---+
   | 0 | 1 | 1 | 1 |  1.75 * 2^2 = [ 1*2^0 + 1*2^(-1) + 1*2^(-2) ] * 2
+^2
   +---+---+---+---+

Another way of phrasing that is in terms of number of integer bits lef
+t.

   +---+---+---+---+
   | 0 | 1 | 1 | 1 |  1.75 B2 = [ 1*2^0 + 1*2^(-1) + 1*2^(-2) ] * 2^(4
+-2)
   +---+---+---+---+

In other words, using smaller B-scalings increases precision:
The following show the precision for a given scaling.

   +---+---+---+---+
   | 0 | 0 | 0 | 1 |  [ 1*2^0    ] * 2^(4-4) = 1 B4
   +---+---+---+---+

   +---+---+---+---+
   | 0 | 0 | 0 | 1 |  [ 1*2^(-1) ] * 2^(4-3) = 0.5 B3
   +---+---+---+---+

   +---+---+---+---+
   | 0 | 0 | 0 | 1 |  [ 1*2^(-2) ] * 2^(4-2) = 0.25 B2
   +---+---+---+---+

   +---+---+---+---+
   | 0 | 0 | 0 | 1 |  [ 1*2^(-3) ] * 2^(4-1) = 0.125 B1
   +---+---+---+---+

   +---+---+---+---+
   | 0 | 0 | 0 | 1 |  [ 1*2^(-4) ] * 2^(4-0) = 0.0625 B0
   +---+---+---+---+

Nothing says we have to stop at 0.

   +---+---+---+---+
   | 0 | 0 | 0 | 1 |  [ 1*2^(-5) ] * 2^(4--1) = 0.03125 B-1
   +---+---+---+---+

The downside is that biggest number we can represent goes down
as the precision increases.

   +---+---+---+---+
   | 1 | 1 | 1 | 1 |  [ 1*2^3    + 1*2^2    + 1*2^1    + 1*2^0    ] * 
+2^(4-4) = 15 B4
   +---+---+---+---+

   +---+---+---+---+
   | 1 | 1 | 1 | 1 |  [ 1*2^2    + 1*2^1    + 1*2^0    + 1*2^(-1) ] * 
+2^(4-3) = 7.5 B3
   +---+---+---+---+

   +---+---+---+---+
   | 1 | 1 | 1 | 1 |  [ 1*2^1    + 1*2^0    + 1*2^(-1) + 1*2^(-2) ] * 
+2^(4-2) = 3.75 B2
   +---+---+---+---+

   +---+---+---+---+
   | 1 | 1 | 1 | 1 |  [ 1*2^0    + 1*2^(-1) + 1*2^(-2) + 1*2^(-3) ] * 
+2^(4-1) = 1.875 B1
   +---+---+---+---+

   +---+---+---+---+
   | 1 | 1 | 1 | 1 |  [ 1*2^(-1) + 1*2^(-2) + 1*2^(-3) + 1*2^(-4) ] * 
+2^(4-0) = 0.9375 B0
   +---+---+---+---+

   +---+---+---+---+
   | 1 | 1 | 1 | 1 |  [ 1*2^(-2) + 1*2^(-3) + 1*2^(-4) + 1*2^(-5) ] * 
+2^(4--1) = 0.46875 B-1
   +---+---+---+---+

We can go the other way, if need be.

   +---+---+---+---+
   | 1 | 1 | 1 | 1 |  [ 1*2^4    + 1*2^3    + 1*2^2    + 1*2^1    ] * 
+2^(4-5) = 30 B5
   +---+---+---+---+

   +---+---+---+---+
   | 0 | 0 | 0 | 1 |  [ 1*2^1    ] * 2^(4-5) = 2 B5
   +---+---+---+---+

<blockquote><i>How does multiplying by 4294967296 not include 1 but mu
+ltiplying by 2147483648 does?</i></blockquote>

Multiplying by 4294967296 (2^32) gives 32 bits for the decimal portion
+,
leaving no bits (32-32) for the integer portion. In other words, B0.
1 does not fit into no bits.

   +   +---+-----
     1 | 0 | ...         1 B0
   +   +---+-----

Multiplying by 2147483648 (2^31) gives 31 bits for the decimal portion
+,
leaving 1 bit (32-31) for the integer portion. In other words, B1.
1 fits in one bit.

       +---+---+-----
       | 1 | 0 | ...     1 B1
       +---+---+-----


So what's the advantage over floating-point numbers?
Floating-point numbers does this the same way, and it does it for you.
However, in order to do that, it must save the scaling (called "expone
+nt")
along with the number. That means floats require more memory to store 
+than these.

So what good is any of this anyway?
You can use integer arithmetic on these numbers!

   X Bi + Y Bi = (X+Y) Bi
   X Bi - Y Bi = (X-Y) Bi
   X Bi * Y Bj = (X*Y) B(i+j)
   X Bi / Y Bj = (X/Y) B(i-j)
   X Bi >> j   = X B(i+j)
   X Bi << j   = X B(i-j)

You can also compare numbers of the same scaling.

   X Bi <  Y Bi
   X Bi >  Y Bi
   X Bi == Y Bi
   etc.

Do be careful about overflow!

   A = ...    # B1
   B = ...    # B1
   C = A + B  # POTENTIAL OVERFLOW!
              # 1 B1 + 1 B1 will overflow, for example

There are two ways of avoiding overflow.
You can make sure in advance that the numbers won't cause an overflow,
+ or
you can switch to a different B-scaling (at the cost of a bit of preci
+sion).

   A = ...     # B1
   A = A >> 1  # B2
   B = ...     # B1
   B = B >> 1  # B2
   C = A + B   # B2

We do lots of works with values in [0.0..1.0]. When we need to calcula
+te by
how much a valve should be open, we calculate it using values in [0.0.
+.1.0].
Later on, we convert them to the right value to send to the Analog Out
+put.
It sounds like you want to deal with numbers in this same range, so I 
+thought
the following would be very useful to you:

   A = ...          # B1  Must be between 0.0 and 1.0.
   B = ...          # B1  Must be between 0.0 and 1.0.
   C = A * B        # B2
   C = C << 1       # B1  Safe, since 1.0 * 1.0 = 1.0.

Although we've only dealt unsigned numbers, everything
I've said so far applies to signed 2s complement numbers.
[download]
Comment on Fixed Point Numbers Download Code

Problems? Is your data what you think it is?
	PerlMonks