How about fixed point arithmetic? If 18 digits really is "good enough", then a signed 64 bit int will give you 18 digits (int part plus 17 decimal places). If you need more, does the compiler have a 128 bit int type? If not, you could code routines that cascade a signed 64 with an unsigned 64 to give you a signed 128.
| [reply] |
This is what is tickling the back of my brain; but I'm stuck as to how to implement it?
As an example of when I run out of precision: I have the value: 0.0000000045 that I have to multiply by itself: 0.00000000000000002025 which becomes 0.0000000000000000 with doubles.
I'm thinking about something along the lines of storing that as: [-10, 45]; multiply it by itself and I get [-20, 2025].
Using a single byte for the fixed point will get me down to +-1e-127 which is probably more than enough to be going on with.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
| [reply] [d/l] [select] |
| [reply] |
0.00000000000000002025 which becomes 0.0000000000000000 with doubles.
Thinking on this more, I wonder if you are really having string-ification problem. Digits could be getting lost in that process. The value in normalized notation is 2.025 * 10^-17, which is well within the capabilities of doubles, which use double-precision floating-point format
Update: Fixed a typo.
| [reply] [d/l] |
0.00000000000000002025 is 21 digits, so obviously you need more than 18 digits.
Using a single byte for the fixed point will get me down to +-1e-127
I think you are misunderstanding "fixed point". Fixed point means the exponent part is always the same. For example, with 64 bit signed integers:
- 0.0000000045 is represented by the integer value 450000000
- Your maximum value of 4.0 would be represented by the integer value 400000000000000000
- Your minimum, -4.0 would be -400000000000000000
But you need more than 18 digits, so you need to cascade. Though I misspoke. What you cascade depends the underlying hardware. For a PC with 64 bit integers, you need to cascade 3 or 4 32 bit integers to your needed digits. For a PC with 32 bit integers, you need to cascade at least 5 16 bit integers.
Let's assume your PC has 64 bit integers. To add 2 numbers, A and B, you split them into Ah, Am, Al, Bh, Bm and Bl, then do the additions:
/* C */
sint64_t A, B, Ah, Am, Al, Bh, Bm, Bl, Ch, Cl, /* etc */;
sint128 C;
Al = A & 0xFFFFFFFF;
Am = A >> 32;
Ah = A >> 64;
Bl = B & 0xFFFFFFFF;
Bm = A >> 32;
Bh = A >> 64;
Cl = Al + Bl;
Cm = Am + Bm + (Cl >> 32);
Ch = Ah + Bh + (Cm >> 32);
C = (Ch << 64) + ((Cm & 0xFFFFFFFF) << 32) + (Cl & 0xFFFFFFFF);
Of course, there are some easy optimizations to this I leave as an exercise for the reader. Also leaving subtraction, multiplication and division as exercises for the reader.
Note: sint64_t and sint128_t are the C99 standard type names for "exactly 64/128 bit, signed integer"
Welcome to the world of "bit bashing".
| [reply] [d/l] [select] |