in reply to Re^3: subtraction issue
in thread subtraction issue
Feel free to ignore this post. I just couldn't help myself (I needed the coffee-break distraction, apparently)
Interestingly (to me), as detailed in my post in another thread, it takes n fixed-point decimal digits to exactly represent an n-bit fixed-point fractional binary number: ie, given an IEEE-754 "double", which expresses to a precision of 52 digits after the binary-point: to represent 2**-52, it takes 52 digits after the decimal point to represent it exactly in decimal. However, if you just want "good enough" to be able to distinguish between the smallest change (ULP: the smallest binary digit in the floating-point representation; the value of that bit is the smallest change you can see in a decimal number; see Re: Determining the minimum representable increment/decrement possible?) -- for that, you only need enough digits to look at the leading digit of the decimal expansion of 2**-52. To find the location (d) of the leading digit of the value of some fractional number (f), use $d = POSIX::ceil(-POSIX::log10($f)).
If you are using a scientific-notation format (%e), then you only need a precision of 16 digits (%.16e) to uniquely represent each double floating-point value. If you are using a fixed-point notation (%f), then you will need fewer than 16 digits of precision (%.16f) if your abs(number) is greater than 4, but more than 16 digits of precision if your abs(number) is less than 0.5. The code below shows some examples (shamless plug: Data::IEEE754::Tools was written to work even with older perls, where %a wouldn't help me. The to_xxx_floatingpoint() functions show you exact hex and sufficient decimal floating-point representations using p-### to show the power of two, instead of e-### to show the power of 10 -- it makes it closer to the internal IEEE-754 representation).
use warnings; use strict; use Data::IEEE754::Tools qw(to_hex_floatingpoint to_dec_floatingpoint +nextUp nextDown); use 5.022; # for %a printf "%-6.6s %30.30s %30.30s %30.30s %30.30s %30.30s\n", 'approx +', '%30.16f', '%30.16e', '%a', 'to_hex_floatingpoint()', 'to_dec_floa +tingpoint()'; foreach my $v ( map { $_ / 10. } 0 .. 100, 1280, 10240) { # 0 .. 1 +0, 128, 1024 my $d = nextDown($v); my $u = nextUp($v); printf "%-6.1f-ULP %30.16f %30.16e %30.13a %30.30s %30.30s\n", $v, + $d, $d, $d, to_hex_floatingpoint($d), to_dec_floatingpoint($d); printf "%-6.1f %30.16f %30.16e %30.13a %30.30s %30.30s\n", $v, + $v, $v, $v, to_hex_floatingpoint($v), to_dec_floatingpoint($v); printf "%-6.1f+ULP %30.16f %30.16e %30.13a %30.30s %30.30s\n", $v, + $u, $u, $u, to_hex_floatingpoint($u), to_dec_floatingpoint($u); print "\n"; } printf "%-6.6s %30.30s %30.30s %30.30s %30.30s %30.30s\n", 'approx +', '%30.16f', '%30.16e', '%a', 'to_hex_floatingpoint()', 'to_dec_floa +tingpoint()';
snippets of output:
approx %30.16f %30.1 +6e %a to_hex_floatingpoint() + to_dec_floatingpoint() ... 0.4 -ULP 0.4000000000000000 3.9999999999999997e-0 +01 0x1.9999999999999p-2 +0x1.9999999999999p-0002 + +0d1.5999999999999999p-0002 0.4 0.4000000000000000 4.0000000000000002e-0 +01 0x1.999999999999ap-2 +0x1.999999999999ap-0002 + +0d1.6000000000000001p-0002 0.4 +ULP 0.4000000000000001 4.0000000000000008e-0 +01 0x1.999999999999bp-2 +0x1.999999999999bp-0002 + +0d1.6000000000000003p-0002
With .16f, cannot tell the difference between 0.4-ULP and 0.4. However, in full scientific notation (.16e), you can.
0.5 -ULP 0.4999999999999999 4.9999999999999994e-0 +01 0x1.fffffffffffffp-2 +0x1.fffffffffffffp-0002 + +0d1.9999999999999998p-0002 0.5 0.5000000000000000 5.0000000000000000e-0 +01 0x1.0000000000000p-1 +0x1.0000000000000p-0001 + +0d1.0000000000000000p-0001 0.5 +ULP 0.5000000000000001 5.0000000000000011e-0 +01 0x1.0000000000001p-1 +0x1.0000000000001p-0001 + +0d1.0000000000000002p-0001
Format %.16f is sufficient for resolving differences between 0.5-ULP and 0.5 and 0.5+ULP.
... 4.0 -ULP 3.9999999999999996 3.9999999999999996e+0 +00 0x1.fffffffffffffp+1 +0x1.fffffffffffffp+0001 + +0d1.9999999999999998p+0001 4.0 4.0000000000000000 4.0000000000000000e+0 +00 0x1.0000000000000p+2 +0x1.0000000000000p+0002 + +0d1.0000000000000000p+0002 4.0 +ULP 4.0000000000000009 4.0000000000000009e+0 +00 0x1.0000000000001p+2 +0x1.0000000000001p+0002 + +0d1.0000000000000002p+0002 4.1 -ULP 4.0999999999999988 4.0999999999999988e+0 +00 0x1.0666666666665p+2 +0x1.0666666666665p+0002 + +0d1.0249999999999997p+0002 4.1 4.0999999999999996 4.0999999999999996e+0 +00 0x1.0666666666666p+2 +0x1.0666666666666p+0002 + +0d1.0249999999999999p+0002 4.1 +ULP 4.1000000000000005 4.1000000000000005e+0 +00 0x1.0666666666667p+2 +0x1.0666666666667p+0002 + +0d1.0250000000000001p+0002
At 4.0,rounding would show "4.000000000000000", "4.000000000000000", and "4.000000000000001" for %.15f, so you still need 16 digits of precision. But at 4.1, it would show "4.099999999999999", "4.100000000000000", and "4.100000000000001", so %.15f would be sufficient.
... 10.0 -ULP 9.9999999999999982 9.9999999999999982e+0 +00 0x1.3ffffffffffffp+3 +0x1.3ffffffffffffp+0003 + +0d1.2499999999999998p+0003 10.0 10.0000000000000000 1.0000000000000000e+0 +01 0x1.4000000000000p+3 +0x1.4000000000000p+0003 + +0d1.2500000000000000p+0003 10.0 +ULP 10.0000000000000020 1.0000000000000002e+0 +01 0x1.4000000000001p+3 +0x1.4000000000001p+0003 + +0d1.2500000000000002p+0003 128.0 -ULP 127.9999999999999900 1.2799999999999999e+0 +02 0x1.fffffffffffffp+6 +0x1.fffffffffffffp+0006 + +0d1.9999999999999998p+0006 128.0 128.0000000000000000 1.2800000000000000e+0 +02 0x1.0000000000000p+7 +0x1.0000000000000p+0007 + +0d1.0000000000000000p+0007 128.0 +ULP 128.0000000000000300 1.2800000000000003e+0 +02 0x1.0000000000001p+7 +0x1.0000000000001p+0007 + +0d1.0000000000000002p+0007 1024.0-ULP 1023.9999999999999000 1.0239999999999999e+0 +03 0x1.fffffffffffffp+9 +0x1.fffffffffffffp+0009 + +0d1.9999999999999998p+0009 1024.0 1024.0000000000000000 1.0240000000000000e+0 +03 0x1.0000000000000p+10 +0x1.0000000000000p+0010 + +0d1.0000000000000000p+0010 1024.0+ULP 1024.0000000000002000 1.0240000000000002e+0 +03 0x1.0000000000001p+10 +0x1.0000000000001p+0010 + +0d1.0000000000000002p+0010 approx %30.16f %30.1 +6e %a to_hex_floatingpoint() + to_dec_floatingpoint()
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^5: subtraction issue
by syphilis (Archbishop) on Jan 28, 2017 at 01:47 UTC | |
by pryrt (Abbot) on Jan 28, 2017 at 03:36 UTC | |
by syphilis (Archbishop) on Jan 28, 2017 at 06:59 UTC | |
by pryrt (Abbot) on Jan 28, 2017 at 18:30 UTC | |
by syphilis (Archbishop) on Jan 28, 2017 at 22:34 UTC | |
| |
by BrowserUk (Patriarch) on Jan 28, 2017 at 13:45 UTC | |
by syphilis (Archbishop) on Jan 28, 2017 at 23:10 UTC | |
|