whakka has asked for the wisdom of the Perl Monks concerning the following question:

Here I go doing this on a Windows machine:
C:\>perl -e "print ( ( -95.3009707988281 + -95.1877600585938 ) / 2 . \"\n\" )"
-95.244365428711
And this on a Unix machine:
$ perl -e 'print ( ( -95.3009707988281 + -95.1877600585938 ) / 2 . "\n" )'
-95.2443654287109
Both on the same version of Perl (5.8.8). What gives?

Replies are listed 'Best First'.
Re: Decimal precision issue: Windows vs. Unix
by merlyn (Sage) on Jan 09, 2009 at 21:39 UTC
    You're worried that your floating point value (an approximation to begin with) is showing a rounding error equivalent to one meter out of the distance to the nearest star (except for our sun)?

    Why?

      They need to be exact because they're being used for look-up values. But when you put it like that I should be rounding in the first place. Think of my question as academic :)

      Edit: Okay you answered it in your post - they're approximations. To be more specific, I guess this behavior is expected - is it operating system dependent or processor or what?

        This statement:
        They need to be exact because they're being used for look-up values.
        and this one:
        I should have mentioned it's a lookup value in a database table.

        suggest to me that there's some sort of cognitive mismatch between what your code plus database are supposed to accomplish, and what you are actually trying to implement. If the various observations here about the inherent inexactness of floating point values don't solve your database lookup problem, you may need to start another thread about what the real problem is (trying to do database lookups on the basis of computed values or something like that).

        If it's a lookup value, I assume it's a constant; in that case, why calculate it at runtime at all?
        If you're interested, check my scratchpad for some code to test accuracy based on algorithms from an astronomy book.
Re: Decimal precision issue: Windows vs. Unix
by gone2015 (Deacon) on Jan 10, 2009 at 00:01 UTC

    To look at this more closely I tried:

    my $a = -95.3009707988281 ; show($a) ;
    my $b = -95.1877600585938 ; show($b) ;
    my $x = ($a + $b) / 2     ; show($x) ;
    printf "%20.16f\n", $x ;

    sub show {
       my ($f) = @_ ;
       printf "%-18s 0x%04X_%04X_%04X_%04X\n", $f, unpack("n4", pack("d>", $f)) ;
    } ;
    this gave:
    Windows, perl 5.10.0, 32-bit               Linux, perl 5.10.0, 64-bit
    -95.3009707988281  0xC057_D343_1B06_8122   -95.3009707988281  0xC057_D343_1B06_8122
    -95.1877600585938  0xC057_CC04_42C3_C9F2   -95.1877600585938  0xC057_CC04_42C3_C9F2
    -95.244365428711   0xC057_CFA3_AEE5_258A   -95.2443654287109  0xC057_CFA3_AEE5_258A
    -95.2443654287109500                       -95.2443654287109496
    
    which shows that the decimal to binary conversion is giving the same result, as is the arithmetic. What is different is the binary to decimal conversion. It would appear that Perl is stringifying to 15 significant decimal digits, discarding any trailing zeros. Other experiments suggest that the library under Windows is returning 17 significant decimal digits, but the library under Linux is returning rather more -- giving a different result when rounded to 15 decimal digits.

    Binary/Decimal conversion is a whole lot trickier than it looks. Producing no more than 17 decimal digits is not unreasonable for IEEE 754 double precision floats. On the face of it, however, what we have here is a double rounding under Windows, which I think is incorrect.
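    For instance, here is a minimal sketch of mine (not from the post above) showing the same double rendered at both precisions -- the 15-digit result depends on which C run-time perl was built against, which is the issue here:

    my $x = ( -95.3009707988281 + -95.1877600585938 ) / 2 ;
    printf "%.15g\n", $x ;   # -95.2443654287109 (glibc) or -95.244365428711 (MS CRT)
    printf "%.17g\n", $x ;   # -95.24436542871095 on both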

    Nevertheless, it is "ambitious" to expect any two floating point values to be exactly equal!

      17 digits of precision is the max for doubles. Are you aware that your linux perl is using long doubles?
        Are you aware that your linux perl is using long doubles?

        I don't think it is. For the expression presented in the original post in this thread, on linux (without long double support) I get the same -95.2443654287109, but on the same machine (and OS), when perl has been built with -Duselongdouble, the output is -95.24436542871095.

        Of course, the output of perl -V:archname would tell us for sure.

        Cheers,
        Rob

        No, I'm not aware of it using long doubles. perl -V gives:

        Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
          Platform:
            osname=linux, osvers=2.6.18-92.1.10.el5, archname=x86_64-linux-thread-multi
        ....
            intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
            d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
            ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
            alignbytes=8, prototype=define
        ....
        
        which suggests to me that it's using ordinary 8 byte doubles.

Re: Decimal precision issue: Windows vs. Unix
by DStaal (Chaplain) on Jan 09, 2009 at 21:44 UTC

    If they aren't exactly the same machine, I'd say that's likely a hardware issue: You're dealing with the limits of the precision of the machine, and how the processor rounds numbers when it can't give the exact value.

    If you need to work with numbers like this, take a good look at Math::BigInt.
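    For decimal arithmetic like the sum above, Math::BigFloat (Math::BigInt's floating point sibling) is the more direct fit. A minimal sketch of mine, doing the sum in decimal and so sidestepping binary floating point entirely:

    use Math::BigFloat;
    my $x = ( Math::BigFloat->new('-95.3009707988281')
            + Math::BigFloat->new('-95.1877600585938') ) / 2;
    print $x, "\n";   # -95.24436542871095 -- the same on every platform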

      I imagine that differences in the C library could affect the result as well. In fact, I'd say that's likely here, since both numbers are the same, just printed with a different number of digits.

      Update: Confirmed

      $x = ( -95.3009707988281 + -95.1877600585938 ) / 2;
      $h = uc unpack 'H*', reverse pack 'd', $x;
      printf("%s\n%.16e\n%s\n", $h, $x, $x);

      gives

      # MS cl, Windows, x86
      C057CFA3AEE5258A
      -9.5244365428710950e+001
      -95.244365428711

      # gcc, linux, x86
      C057CFA3AEE5258A
      -9.5244365428710950e+01
      -95.2443654287109

      The number is exactly the same, it's just the conversion to text by the C library that outputs a different number of digits.

Re: Decimal precision issue: Windows vs. Unix
by davido (Cardinal) on Jan 10, 2009 at 06:31 UTC

    As for the "why" of rounding errors associated with the representation of floating point base-ten numbers using a binary subsystem, see perlfaq4, and perlnumber. This is a pretty common question. On any given day there's about a 5% chance some version of this question will show up here, which earns it a slot in the FAQ.

    As for why two builds of Perl display the rounding error differently, that's most likely a difference in the build, either in the compiler's nuances or in the architecture the compiler targets.

    At any rate, floating point numbers really don't make good lookup keys. If absolute precision is required, you're probably better off stringifying and manipulating the stringified representation of numbers by hand, or re-thinking the strategy altogether so as to favor a more reliable lookup method.
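    For example (a sketch of mine, with an arbitrarily chosen precision -- pick whatever your application actually needs): normalise the number to a fixed-precision string before using it as a key, so every platform produces the same key.

    my $x   = ( -95.3009707988281 + -95.1877600585938 ) / 2;
    my $key = sprintf '%.9f', $x;           # '-95.244365429' everywhere
    my %lookup = ( $key => 'some value' );
    print $lookup{ sprintf '%.9f', $x }, "\n";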


    Dave

Re: Decimal precision issue: Windows vs. Unix
by swampyankee (Parson) on Jan 10, 2009 at 00:44 UTC

    It's probably due as much to the C compilers and libraries used to build the particular version of Perl 5.8.8 on each platform as to the platforms themselves. It may be salutary to build (from source) 5.8.8 on the same platform with two different compilers, say MSVC and gcc.



      It may be salutary to build (from source) 5.8.8 on the same platform with two different compilers, say MSVC and gcc.

      The gcc compiler (assuming it's the MinGW port) will be using the Microsoft msvcrt.dll anyway. You might be able to find differences between the two if the MSVC perl is built with a compiler other than VC6.0, but I find that Strawberry Perl, ActivePerl, my own gcc-built perl, and my own VC7.0-built perl all produce the same result for the one-liner originally presented.

      Cheers,
      Rob

        Rob, thanks for the answer. I wonder if the same sort of differences in floating point behavior could exist between Perl builds on, say, FreeBSD vs different Linux distros, etc.

        Ed


Re: Decimal precision issue: Windows vs. Unix
by gone2015 (Deacon) on Jan 10, 2009 at 19:37 UTC

    Not to detract from the general point that you should never rely on the exact value of a floating point number...

    ...I am increasingly of the opinion that this demonstrates a fault in the Windows library.

    When I try it on my Windows XP 32-bit perl 5.10.0, and on my Linux 64-bit perl 5.10.0, the result of the sum is the same, to wit an IEEE 754 double 0xC057_CFA3_AEE5_258A, which is approximately -95.244365428710949572632671333849

    It appears that when stringifying a floating point value in IEEE 754 double form, Perl is working to 15 significant decimal digits. This is the usual value -- chosen because it ensures that conversion from decimal to binary and back again gives the original decimal value. Under Windows the perl I have is using sprintf(b, "%.*g", 15, x), whereas under Linux it is using gcvt(x, 15, b) -- at least that is what use Config tells me d_Gconvert is set to.
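    You can check what your own perl was configured to use (a quick sketch; the exact string depends on how your perl was built):

    use Config;
    print $Config{d_Gconvert}, "\n";
    # e.g.  sprintf((b),"%.*g",(n),(x))   -- this Windows perl
    #       gcvt((x),(n),(b))             -- this Linux perl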

    Anyway, as you know, the results are:

    -95.2443654287109                    -- Linux
    -95.244365428710949572632671333849    ~ true value
    -95.244365428711                     -- Windows
    which demonstrates that the Linux value is 'correctly rounded' to 15 significant decimal digits, while the Windows value... isn't. IEEE 754 requires correct rounding for numbers of this sort of size.

    One of the tricky things about binary to decimal conversion is: to do it right you have to delay a rounding step until after you've generated the decimal value, and then round to the required number of decimals. It is reasonable to limit the conversion of IEEE 754 doubles to 17 significant decimal digits -- because that is sufficient for binary to decimal and back to binary to give the original value, for reasonable size values.

    You can see that when rounded to 17 significant digits the value is -95.244365428710950, rounding that to 15 digits gives -95.244365428711. I observe that sprintf('%24.20f', ...) under Windows gives zeros after the 17th significant digit...
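    The double rounding is easy to reproduce in decimal. A sketch of mine using Math::BigFloat (which rounds half-to-even by default), rounding the near-exact value in one step versus two:

    use Math::BigFloat;
    my $exact = Math::BigFloat->new('-95.244365428710949572632671333849');
    print $exact->copy->bround(15), "\n";              # -95.2443654287109  (one step)
    print $exact->copy->bround(17)->bround(15), "\n";  # -95.244365428711   (17 then 15)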

      IEEE 754 requires correct rounding for numbers of this sort of size.

      IEEE 754 has four rounding modes:

      1. Nearest.

        Rounds to the nearest value; when the value is exactly equidistant (a trailing 5), it rounds to the nearest even digit.

      2. Up

        Next higher

      3. Down.

        Next lower.

      4. Chop.

        Truncate.

      All are valid.

      The MS CRT provides _control87 and variants to allow the user to choose which mode they require, with the default being nearest.

      In the case of -95.24436542871095 (the fullest precision available from an 8-byte float), with the last digit (5) being equidistant from ...09 and ...10, it rounds to the even value. Per the spec.
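      Round-half-to-even is easy to see with halves that are exactly representable in binary (a sketch of mine; the output shown is what glibc produces under the default rounding mode):

      printf "%.0f %.0f %.0f %.0f\n", 0.5, 1.5, 2.5, 3.5 ;
      # 0 2 2 4 -- each tie rounds to the nearest even integer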


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        As you say, by default one would expect round to nearest. I accept that I did not make that explicit... I apologise to anyone who was confused by my failure to be clear that I was not addressing the possible use of any of the Directed Roundings in this case.

        On the topic of rounding, IEEE 754 says:

        4. Rounding

        Rounding takes a number regarded as infinitely precise and, if necessary, modifies it to fit in the destination's format...

        Section 5.6 of the standard specifies that "conversions shall be correctly rounded as specified in Section 4", for a range of numbers which includes those under discussion.

        So, in the case in point, the rounding decision for the binary to decimal conversion to 15 significant decimal digits should start from approximately -95.244365428710949572632671333849 if it is to be 'correctly rounded'. Starting from the already rounded -95.24436542871095 gives the wrong result, as previously discussed.

        For more on the standard there's Supplemental Readings for IEEE 754 / 854. Jerome Coonen's "Contributions to a Proposed Standard for Binary Floating-Point Arithmetic" has a complete chapter on "Accurate yet Economical Binary-Decimal Conversions", which I can recommend.

Re: Decimal precision issue: Windows vs. Unix
by Lawliet (Curate) on Jan 09, 2009 at 22:01 UTC

    Unix wins again! Haha.

    And you didn't even know bears could type.

      $ echo '( -95.3009707988281 + -95.1877600585938 ) / 2' | bc -l
      -95.24436542871095000000
      $
      It seems that -95.2443654287109 and -95.244365428711 are both approximations, with exactly the same error. It's just that one library decides to round down, the other up.
      eh? Why do you say that? They're the same number.

      It was a joke :\

      I find it funny because most fanboys will find any little thing they can to trump their opponent. In this case it was how the library rounds, or, rather, how far out it rounds.

      Update: :P @ anonny's reply

      And you didn't even know bears could type.

        Please keep your jokes to the joke forum and joke threads :| esp. if they're all this funny :|