in reply to Re^6: RFC: Large Floating Point Numbers - Rounding Errors
in thread RFC: Large Floating Point Numbers - Rounding Errors

... this is a borderline case.

Because YOU haven't set the rounding mode:

#include <stdio.h> #include <float.h> int main( int argc, char **argv ) { double n; printf( "Setting mode CHOP; controlFP returned: %u\n", _controlfp( _RC_CHOP, _MCW_RC ) ); for( n = 0.000005; n < 0.0001; n += 0.00001 ) { printf( "%.15f %.5f\n", n, n ); } printf( "Setting mode UP; controlFP returned: %u\n", _controlfp( _RC_UP, _MCW_RC ) ); for( n = 0.000005; n < 0.0001; n += 0.00001 ) { printf( "%.15f %.5f\n", n, n ); } printf( "Setting mode DOWN; controlFP returned: %u\n", _controlfp( _RC_DOWN, _MCW_RC ) ); for( n = 0.000005; n < 0.0001; n += 0.00001 ) { printf( "%.15f %.5f\n", n, n ); } printf( "Setting mode NEAR; controlFP returned: %u\n", _controlfp( _RC_NEAR, _MCW_RC ) ); for( n = 0.000005; n < 0.0001; n += 0.00001 ) { printf( "%.15f %.5f\n", n, n ); } return 1; }
C:\test>ieeeroundingmode.exe Setting mode CHOP; controlFP returned: 525087 0.000005000000000 0.00001 0.000015000000000 0.00002 0.000025000000000 0.00003 0.000035000000000 0.00003 0.000045000000000 0.00004 0.000055000000000 0.00005 0.000065000000000 0.00006 0.000075000000000 0.00007 0.000085000000000 0.00008 0.000095000000000 0.00009 Setting mode UP; controlFP returned: 524831 0.000005000000000 0.00001 0.000015000000000 0.00002 0.000025000000000 0.00003 0.000035000000000 0.00004 0.000045000000000 0.00005 0.000055000000000 0.00006 0.000065000000000 0.00007 0.000075000000000 0.00008 0.000085000000000 0.00009 0.000095000000000 0.00010 Setting mode DOWN; controlFP returned: 524575 0.000005000000000 0.00001 0.000015000000000 0.00002 0.000025000000000 0.00003 0.000035000000000 0.00003 0.000045000000000 0.00004 0.000055000000000 0.00005 0.000065000000000 0.00006 0.000075000000000 0.00007 0.000085000000000 0.00008 0.000095000000000 0.00009 Setting mode NEAR; controlFP returned: 524319 0.000005000000000 0.00001 0.000015000000000 0.00002 0.000025000000000 0.00003 0.000035000000000 0.00004 0.000045000000000 0.00005 0.000055000000000 0.00006 0.000065000000000 0.00007 0.000075000000000 0.00008 0.000085000000000 0.00009 0.000095000000000 0.00010

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^8: RFC: Large Floating Point Numbers - Rounding Errors
by GAVollink (Novice) on Sep 08, 2011 at 19:52 UTC
    Sadly, Solaris doesn't implement _controlFP (nor does Inline exist in this environment). There is a Solaris method by which I can ask what the rounding method is on this system, just no way to set it.

    Like I said, something bigger is wrong here, and no, I shouldn't have to use a string manipulation program to deal with this, but ... I'm pretty stuck otherwise.
      Sadly, Solaris doesn't implement _controlFP()

      In that case, maybe something like this would work for you?

      sub rnd{ my( $p, $n ) = @_; return sprintf "%.${p}f", $n + "0.5e-$p" };; print $_, ': ', rnd( 5, $_ ) for 0.000005, 0.000015, 0.000025, 0.000035, 0.000045, 0.000055, 0.000065, 0.000075, 0.000085, 0.000095;; 5e-006 : 0.00001 1.5e-005 : 0.00002 2.5e-005 : 0.00003 3.5e-005 : 0.00004 4.5e-005 : 0.00005 5.5e-005 : 0.00006 6.5e-005 : 0.00007 7.5e-005 : 0.00008 8.5e-005 : 0.00009 9.5e-005 : 0.00010

      Even if it isn't faster than your code -- which instinct tells me it should be -- then it is at least a darn sight clearer :)


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        It's clear enough to plainly see that you are adding something to the number, then rounding it, which means that in any thing close to 0.49e-${p}, you'll force a round-up, so often wrong, in the other direction.

        Being 9.0e-5 + 0.5e-5 is 9.5e-5 -- So, you are forcing it to always add "up".

        Just adding a single 0.00009 at the end of your code, shows how bad this can be. However, the same problem also happens (with a smaller error area) if you try a higher exponent -- it simply intentionally inserts an error, which is a really dangerous way to try to correct a problem.

        sub rnd{ my( $p, $n ) = @_; return sprintf "%.${p}f", $n + "0.5e-$p" };; print $_, ': ', rnd( 5, $_ ), qq{\n} for 0.000005, 0.000015, 0.000025, 0.000035, 0.000045, 0.000055, 0.000065, 0.000075, 0.000085, 0.000095, 0.00009;; __DATA__ 5e-06: 0.00001 1.5e-05: 0.00002 2.5e-05: 0.00003 3.5e-05: 0.00004 4.5e-05: 0.00005 5.5e-05: 0.00006 6.5e-05: 0.00007 7.5e-05: 0.00008 8.5e-05: 0.00009 9.5e-05: 0.00010 9e-05: 0.00010
Re^8: RFC: Large Floating Point Numbers - Rounding Errors
by salva (Canon) on Sep 08, 2011 at 16:48 UTC
    You are cheating! can't you see it?
      You are cheating!

      Cheating? By having the compiler produce the correct, required values? By using the FP processor the way it was designed?

      Now I heard all the excuses for not understanding the oft-cited but little understood paper http://docs.sun.com/source/806-3568/ncg_goldberg.html


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        This is for GCC/Linux:
        #include <stdio.h> #include <stdlib.h> #include <fenv.h> double n[] = { 0.000005, 0.000015, 0.000025, 0.000035, 0.000045, 0.000 +055, 0.000065 }; char *str[] = { "0.000005", "0.000015", "0.000025", "0.000035", "0.000 +045", "0.000055", "0.000065" }; int main(int argc, char *argv[]) { int i, j; for (j = 0; j < 4; j++) { fesetround(j); printf("rounding: %d\n", j); double m = n[0]; for (i = 0; i < sizeof(n)/sizeof(*n); i++) { printf(" compiler: %40.30a => %40.30f\n", n[i], n[i]); printf(" atof: %40.30a => %40.30f\n", atof(str[i]), a +tof(str[i])); printf(" calc: %40.30a => %40.30f\n", m, m); m += 0.00001; } } return 0; }
        rounding: 0 compiler: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 atof: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 calc: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 compiler: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 atof: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 calc: 0x1.f75104d551d6a00000000000000000p-17 => 0.0000 +15000000000000002074078756 compiler: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 atof: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 calc: 0x1.a36e2eb1c432e00000000000000000p-16 => 0.0000 +25000000000000004586175190 compiler: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 atof: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 calc: 0x1.2599ed7c6fbd300000000000000000p-15 => 0.0000 +35000000000000003710139834 compiler: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 atof: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 calc: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 compiler: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 atof: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 calc: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 compiler: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 atof: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 calc: 0x1.10a137f38c54400000000000000000p-14 => 0.0000 +65000000000000007858297346 rounding: 1 compiler: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 atof: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 calc: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 compiler: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 atof: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 calc: 0x1.f75104d551d6a00000000000000000p-17 => 0.0000 +15000000000000002074078756 compiler: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 atof: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 calc: 0x1.a36e2eb1c432e00000000000000000p-16 => 0.0000 +25000000000000004586175190 compiler: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 atof: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 calc: 0x1.2599ed7c6fbd300000000000000000p-15 => 0.0000 +35000000000000003710139834 compiler: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 atof: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 calc: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 compiler: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 atof: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 calc: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 compiler: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 atof: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 calc: 0x1.10a137f38c54400000000000000000p-14 => 0.0000 +65000000000000007858297346 rounding: 2 compiler: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 atof: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 calc: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 compiler: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 atof: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 calc: 0x1.f75104d551d6a00000000000000000p-17 => 0.0000 +15000000000000002074078756 compiler: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 atof: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 calc: 0x1.a36e2eb1c432e00000000000000000p-16 => 0.0000 +25000000000000004586175190 compiler: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 atof: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 calc: 0x1.2599ed7c6fbd300000000000000000p-15 => 0.0000 +35000000000000003710139834 compiler: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 atof: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 calc: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 compiler: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 atof: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 calc: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 compiler: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 atof: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 calc: 0x1.10a137f38c54400000000000000000p-14 => 0.0000 +65000000000000007858297346 rounding: 3 compiler: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 atof: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 calc: 0x1.4f8b588e368f100000000000000000p-18 => 0.0000 +05000000000000000409015270 compiler: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 atof: 0x1.f75104d551d6900000000000000000p-17 => 0.0000 +15000000000000000380012861 calc: 0x1.f75104d551d6a00000000000000000p-17 => 0.0000 +15000000000000002074078756 compiler: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 atof: 0x1.a36e2eb1c432d00000000000000000p-16 => 0.0000 +25000000000000001198043401 calc: 0x1.a36e2eb1c432e00000000000000000p-16 => 0.0000 +25000000000000004586175190 compiler: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 atof: 0x1.2599ed7c6fbd200000000000000000p-15 => 0.0000 +34999999999999996933876256 calc: 0x1.2599ed7c6fbd300000000000000000p-15 => 0.0000 +35000000000000003710139834 compiler: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 atof: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 calc: 0x1.797cc39ffd60f00000000000000000p-15 => 0.0000 +45000000000000002834104479 compiler: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 atof: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 calc: 0x1.cd5f99c38b04b00000000000000000p-15 => 0.0000 +55000000000000001958069124 compiler: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 atof: 0x1.10a137f38c54300000000000000000p-14 => 0.0000 +64999999999999994305770190 calc: 0x1.10a137f38c54400000000000000000p-14 => 0.0000 +65000000000000007858297346
        And rounding is as follows:
        1. 0 - Rounding is towards 0.
        2. 1 - Rounding is towards nearest number.
        3. 2 - Rounding is towards positive infinity.
        4. 3 - Rounding is towards negative infinity.
        Cheating? By having the compiler produce the correct, required values? By using the FP processor the way it was designed?

        No, obviusly that's not the reason why you are cheating!