So I'm still short on time but here is a proof-of-concept for fixed-point arithmetic with 128-bit ints in Perl I threw together based on the video tutorials. The same logic used would apply to any other language as well.
I've used Math::Int128 as my 128-bit data type here, but if the target language doesn't support 128-bit ints natively, it should be possible to find a library (or implement one oneself) that abstracts out addition, subtraction, multiplication, and shifts on a custom 128-bit integer type (e.g. 2x 64-bit ints).
Some notes on the code: The Math::Big* modules are for conversion to/from fixed-point only; the idea being that you first bring all your numbers into the fixed-point domain and then do all your work there. The scale of 1/2**124 means there are 4 bits before the decimal point, which is enough for the stated range of -4 to 4. The multiplication method implemented in mul_fixed is the "fast but less accurate" one (shift first, then multiply), because the most accurate one requires a 256-bit output (multiply first, then shift). As explained in the video tutorial, if the multiplicands are in a known range, it's possible to play with the shifting so that less accuracy is lost.
#!/usr/bin/env perl
use strict;
use warnings;
use Math::BigFloat;
use Math::BigInt;
use Math::Int128 qw(int128 int128_to_hex);
use Math::Int128::die_on_overflow;
$\ = $/;
# https://www.youtube.com/watch?v=S12qx1DwjVk
my $SCALE_BITS = 124;
my $SCALE = Math::BigInt->new(2)->bpow($SCALE_BITS);
print " Scale: 2**$SCALE_BITS = $SCALE";
sub number2fixed { # Math::Big* --> fixed-pt Math::Int128
return int128( shift->copy->bmul($SCALE)->as_int->bstr );
}
sub fixed2number { # fixed-pt Math::Int128 --> Math::BigFloat
return scalar Math::BigFloat->new(''.shift)->bdiv($SCALE);
}
my $EPSILON = fixed2number(int128(1));
print "Epsilon: ", $EPSILON->bstr;
print "//// example: Pi";
my $pi1 = Math::BigFloat->bpi(45);
print " orig: $pi1";
my $fpi = number2fixed($pi1);
print "fixed: ", int128_to_hex($fpi);
my $pi2 = fixed2number($fpi);
print " conv: $pi2";
# https://www.youtube.com/watch?v=npQF28g6s_k
sub mul_fixed { # 2x Math::Int128 --> Math::Int128
return (shift>>($SCALE_BITS/2)) * (shift>>($SCALE_BITS/2));
}
print "//// properties of mul_fixed";
print "smallest possible multiplicand: ",
fixed2number(int128(1)<<($SCALE_BITS/2));
# demo effect of the smallest possible multiplicand
my $mul = number2fixed(Math::BigInt->new(3));
my $toosmall = number2fixed(
Math::BigFloat->new("0.000000000000000000216"));
print " too small: ", int128_to_hex($toosmall);
print "\t* 3 = ", fixed2number(mul_fixed($mul,$toosmall));
my $justenough = number2fixed(
Math::BigFloat->new("0.000000000000000000217") );
print "just enough: ", int128_to_hex($justenough);
print "\t* 3 = ", fixed2number(mul_fixed($mul,$justenough));
print "//// test case: (45e-10)**2";
my $val = Math::BigFloat->new("0.0000000045");
print " input value: ", $val->bstr;
my $fval = number2fixed($val);
print " as fixed-pt: ", int128_to_hex($fval);
my $p = mul_fixed($fval,$fval);
print "fixed-pt mul: ", int128_to_hex($p);
print " converted: ", fixed2number($p);
print " expected: ", $val->copy->bpow(2);
Output:
Scale: 2**124 = 21267647932558653966460912964485513216
Epsilon: 0.00000000000000000000000000000000000004701977403289150031874
+946148888982711275
//// example: Pi
orig: 3.14159265358979323846264338327950288419716940
fixed: 3243F6A8885A308D313198A2E0370734
conv: 3.141592653589793238462643383279502884184
//// properties of mul_fixed
smallest possible multiplicand: 0.000000000000000000216840434497100886
+8014905601739883422852
too small: 00000000000000003FC07FA1F14C2C48
* 3 = 0
just enough: 0000000000000000400C0E7219869986
* 3 = 0.0000000000000000006505213034913026604044716805219650268555
//// test case: (45e-10)**2
input value: 0.0000000045
as fixed-pt: 00000001353CD652BB1674942F2A17B4
fixed-pt mul: 000000000000001758BEBD8531A08964
converted: 0.000000000000000020249999998198227269026294503547169656
+24
expected: 0.00000000000000002025
In other words: You've got a little better than 36 decimal places after the point for storage, addition and subtraction. In multiplication, as implemented here, the inputs are currently limited to a bit more than 17 decimal places.
|