in reply to Re^7: Math::BigFloat to native double?
in thread Math::BigFloat to native double?
    0x1.921fb54442d18p+1 => 3.1415926535897931
    0x1.1a62633145c07p-53 => 1.2246467991473532e-16

They still don't add up to something near the original input value - but nor should they.

    0x1.921fb54442d18p+1 => 3.14159265358979311599796346854419
    0x1.1a62633145c07p-53 => 1.2246467991473532071737640294584e-16

The sum of which is:

    use Math::BigFloat;
    $n = Math::BigFloat->new( '3.1415926535897932384626433832795' );
    $d = 0 + $n->bstr;
    printf "%.17f\n", $d;
    printf "%a\n", $d;
    $bfd = Math::BigFloat->new( sprintf "%.17f", $d );
    print $bfd;
    $n -= $bfd;
    print $n;
    printf "%a\n", $n;

which outputs (when run with the "-l" switch):

    3.14159265358979310
    0x1.921fb54442d18p+1
    3.1415926535897931
    0.0000000000000001384626433832795
    0x1.3f45eb146ba31p-53

So the most significant double agrees with my ppc box, but the value of the least significant double differs.
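The effect of rounding the two doubles to different numbers of decimal digits before adding them can be checked with Math::BigFloat alone. The following is a minimal sketch, not code from the thread: the variable names are illustrative, and it assumes no global accuracy or precision has been set, so that Math::BigFloat addition is exact.

```perl
use strict;
use warnings;
use Math::BigFloat;

# The original input value from the thread.
my $input = Math::BigFloat->new('3.1415926535897932384626433832795');

# The two doubles rendered to 17 significant decimal digits.
my $sum17 = Math::BigFloat->new('3.1415926535897931')
          + Math::BigFloat->new('1.2246467991473532e-16');

# The same two doubles rendered to roughly 106 bits' worth of decimal digits.
my $sum106 = Math::BigFloat->new('3.14159265358979311599796346854419')
           + Math::BigFloat->new('1.2246467991473532071737640294584e-16');

print "input:  $input\n";
print "sum17:  $sum17\n";    # 3.14159265358979322246467991473532
print "sum106: $sum106\n";   # 3.14159265358979323846264338327951071737640294584
```

The 17-digit renderings drift from the input at the 17th decimal place, while the longer renderings reproduce every digit of the input.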
Re^9: Math::BigFloat to native double?
by BrowserUk (Patriarch) on Jul 13, 2015 at 15:44 UTC
"They still don't add up to something near the original input value - but nor should they. ... if we want to add in base 10, we need to first convert those hex values to *106* bit precision decimal values."

Okay. That makes no sense to me at all. Given this comes from you, and is math, I pretty much know I'm wrong here. But ...

And now I'm wondering if your hardware double-double implementation is the same thing as the double-double I was referring to. Which, in summary, uses pairs of doubles to achieve greater precision by splitting the values into two parts and storing the hi part and the lo part in the pair. Thus, when the two parts are added together, they form the complete value.

Now, the two parts cannot have more than 53 bits of precision each; so the idea that "if we want to add in base 10, we need to first convert those hex values to *106* bit precision decimal values." doesn't make sense to me. Where does the extra precision for each of the two values, before combining them, come from?

Half of my brain is saying: this is the classic binary representation of decimal values thing; but in reverse. Not so sure that splitting the values on the basis of the Math::BigFloat values can provide correct 106-bit values.

And the other half is saying: M::BF may not be the fastest thing in the arbitrary precision world; but it is arbitrary precision, so surely when it gets there, the results are accurate?

By now you're probably slapping your forehead saying: "Why doesn't he just install Math::MPFR!" And the answer is, I don't want an arbitrary precision or multi-precision library. I'm only using M::BF because it was on my machine and a convenient (I thought) way to test a couple of things to do with my own implementation of the double-double thing (per the paper I linked). One of which is to write my own routines for outputting in hex-float format (done) and converting to decimal. I was working on the latter when I needed to generate the binary decimal constants and here I am.

So why not just use someone else's library?

Oooh! I feel so much better for having got that off my chest :) I realise that I've probably lost your patronage for my endeavour in the process; but so be it.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!
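On the question of where the extra bits come from when each double holds at most 53: the two significands sit at different binary exponents, so their sum can span far more than 53 bits. A small sketch using the two hex values quoted earlier in the thread (the helper `bit_length` and the variable names are illustrative assumptions, not anything from the posts):

```perl
use strict;
use warnings;
use Math::BigInt;

# The two doubles, written as integer significand times a power of two:
#   0x1.921fb54442d18p+1  == 0x1921fb54442d18 * 2**(1 - 52)
#   0x1.1a62633145c07p-53 == 0x11a62633145c07 * 2**(-53 - 52)
my $hi_sig = Math::BigInt->new('0x1921fb54442d18');
my $lo_sig = Math::BigInt->new('0x11a62633145c07');
my ($hi_exp, $lo_exp) = (1 - 52, -53 - 52);

# Hypothetical helper: number of bits needed to write a positive integer.
sub bit_length {
    my ($n) = @_;
    return length($n->as_bin) - 2;    # as_bin() returns a "0b..." string
}

# Express hi + lo in units of 2**$lo_exp so the sum is a single integer.
my $sum_sig = $hi_sig * Math::BigInt->new(2)->bpow($hi_exp - $lo_exp) + $lo_sig;

printf "hi: %d bits, lo: %d bits, hi + lo: %d bits\n",
    bit_length($hi_sig), bit_length($lo_sig), bit_length($sum_sig);
# prints: hi: 53 bits, lo: 53 bits, hi + lo: 107 bits
```

Neither double gains precision on its own; the 107 bits only exist in the exact sum, which is why a decimal rendering of each part has to carry far more than 17 digits before the parts are added in base 10.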
by syphilis (Archbishop) on Jul 14, 2015 at 03:45 UTC
It was a poorly expressed explanation.

The most significant double is 0x1.921fb54442d18p+1. (We agree on this, at least.) That string expresses an exact value, but 3.1415926535897931 is only a very poor approximation of that exact value.

The least significant double is, according to me, 0x1.1a62633145c07p-53. That string expresses an exact value, but 1.2246467991473532e-16 is only a very poor approximation of that exact value.

So ... my doubledouble contains a value that is exactly the sum of both:

    0x1.921fb54442d18p+1 + 0x1.1a62633145c07p-53

But you can't expect the sum of the 2 rough decimal approximations to be equivalent to the exact hex sum of the 2 hex numbers. (And it's not, of course.)

The least significant double that your approach came up with was 0x1.3f45eb146ba31p-53. Your decimal approximation of that exact value is 0.0000000000000001384626433832795. When you add your 2 decimal approximations together you end up with the input value - but that's false comfort. The actual value contained by your doubledouble, according to the way I look at them, is not really the sum of the 2 decimal approximations - it's the sum of the 2 hex values:

    0x1.921fb54442d18p+1 + 0x1.3f45eb146ba31p-53

That corresponds to a base 10 value of 3.141592653589793254460606851823683 (which differs significantly from the input). Your doubledouble, expressed in base 2, is:

    11.001001000011111101101010100010001000010110100011000010011111101000101111010110001010001101011101000110001

which is not the correct 107 bit representation of the input value. If you use that doubledouble as your pi approximation, then you'll only get incorrect results.

FWIW, if I want to calculate the double-double representation of a specific value, I do essentially the same as you, except that I use Math::MPFR instead of Math::BigFloat. And I set precision to 2098 bits. I have:

2098 bits is overkill for the vast majority of conversions. It stems from the fact that the doubledouble can accurately represent certain values up to (but not exceeding) 2098 bits. For example, on my PPC box I can assign $x = (2 ** 1023) + (2 ** -1074); and the doubledouble $x will consist of the 2 doubles 2**1023 and 2**-1074, thereby accurately representing that 2098 bit value. The value of this capability is limited - multiply $x by (say) 1.01 and all of that additional precision is lost. The result is the same as multiplying 2**1023 by 1.01.

Anyway ... first question is "how to set your doubledouble pi correctly using Math::BigFloat ?". I couldn't quickly come up with an answer, but I'll give it some more thought later on today.

Cheers,
Rob
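The distinction drawn above, between the exact value of a hex string and its rounded decimal rendering, can be checked without Math::MPFR. The sketch below is one possible way to do it and is not taken from either poster's code: `hexfloat_to_bigfloat` is a hypothetical helper that turns a hex-float string into an exact Math::BigFloat by rewriting division by 2**k as multiplication by 5**k with a decimal shift, so no rounding occurs. It assumes no global accuracy or precision is set.

```perl
use strict;
use warnings;
use Math::BigInt;
use Math::BigFloat;

# Hypothetical helper: exact value of a hex-float string such as
# "0x1.921fb54442d18p+1", returned as a Math::BigFloat.
sub hexfloat_to_bigfloat {
    my ($str) = @_;
    my ($sign, $int, $frac, $exp) =
        $str =~ /\A([+-]?)0x([0-9a-f]+)(?:\.([0-9a-f]*))?p([+-]?\d+)\z/i
        or die "not a hex float: $str";
    $frac = '' unless defined $frac;

    # Integer significand and the power of two that scales it.
    my $sig  = Math::BigInt->new("0x$int$frac");
    my $exp2 = $exp - 4 * length($frac);

    my $val;
    if ($exp2 >= 0) {
        $val = Math::BigFloat->new($sig * Math::BigInt->new(2)->bpow($exp2));
    }
    else {
        # sig / 2**k == sig * 5**k / 10**k, which is exact in decimal.
        my $k   = -$exp2;
        my $num = $sig * Math::BigInt->new(5)->bpow($k);
        $val = Math::BigFloat->new($num->bstr . "e-$k");
    }
    return $sign eq '-' ? -$val : $val;
}

my $hi    = hexfloat_to_bigfloat('0x1.921fb54442d18p+1');
my $sum_a = $hi + hexfloat_to_bigfloat('0x1.1a62633145c07p-53');
my $sum_b = $hi + hexfloat_to_bigfloat('0x1.3f45eb146ba31p-53');

print "hi:    $hi\n";   # 3.141592653589793115997963468544185161590576171875
print "sum_a: $sum_a\n";
print "sum_b: $sum_b\n";
```

Here `$sum_a` agrees with the input 3.1415926535897932384626433832795 in every digit supplied, while `$sum_b` already departs from it at the 17th decimal place, in line with the base 10 value quoted above.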
by BrowserUk (Patriarch) on Jul 14, 2015 at 05:11 UTC
This is what I have so far for my input/output routines:
The names don't make much sense at the moment but

And this is the output from the above and the decimal calculation that appears to show it works:
by syphilis (Archbishop) on Jul 14, 2015 at 13:14 UTC
by syphilis (Archbishop) on Jul 15, 2015 at 14:58 UTC