Re: [Win32, C, and way OT] C floats, doubles, and their equivalence

The problem is rooted in how v6 (and perhaps everything pre-v8, but I only have v6 and v8) generates the code. The following 'fixes' the problem, though I realise that it may not be a workable solution for you:

#include <stdio.h>

int cmpFsFd( float s, double d ) {
    float tmp = (float)d;
    return s == tmp ? 1 : 0;
}

int main(void) {
   double nv = 2.0 / 3;
   float foo = 2.0 / 3;

   if( foo == nv ) printf("True ");
   else printf("False ");

   if( cmpFsFd( foo, nv )) printf("True\n");
   else printf("False\n");

   return 0;
}

[19:48:23.40} C:\test>cl float.c /Fefloatv8.exe
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.30729.01 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

float.c
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:floatv8.exe
float.obj

[19:49:59.60} C:\test>floatv8
False True

----------------------------------------------------
c:\test>cl float.c /Fefloatv6.exe
Microsoft (R) 32-bit C/C++ Standard Compiler Version 13.00.9466 for 80x86
Copyright (C) Microsoft Corporation 1984-2001. All rights reserved.

float.c
Microsoft (R) Incremental Linker Version 7.00.9466
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:floatv6.exe
float.obj

c:\test>floatv6
False True

Basically, when you coerce a double to a float before comparing it to a float, you need to force the compiler to store the coerced value as a float before doing the comparison. That's what my cmpFsFd() is doing. (Insert underscores to taste :)

The reasoning is that it is only when the values are stored to memory (moved out of the FP registers), that the actual rounding/truncation occurs. Whilst values remain within the FP registers they are maintained as 80-bit FP values, regardless of whether they originate as 32-bit or 64-bit FPs.

The v8 compiler (and presumably others) does the coercion, (float)nv, by storing to and reloading from a temporary 32-bit memory location:

; 15   :    if( foo == (float)nv ) printf("True\n");

    fld    QWORD PTR _nv$[ebp]  ## Load nv onto FPU stack
    fstp    DWORD PTR tv79[ebp]  ## store (and pop) it into a 32-bit (float) temporary
    fld    DWORD PTR tv79[ebp]  ## load it back onto the FPU stack
    fld    DWORD PTR _foo$[ebp] ## load foo onto the FPU stack
    fucompp                      ## do the comparison
    fnstsw    ax                   ## get the FPU status word into AX
    test    ah, 68             ## 00000044H (Check for equality?)
    jp    SHORT $LN2@main      ## Jump 
    push    OFFSET $SG2485       ## or not ...
    call    _printf

The equivalent code generated by the V6 compiler omits that store & load step:

; 15   :    if( foo == (float)nv ) printf("True\n");

    fld    QWORD PTR _nv$[ebp]    ## Load nv to FPU stack
    fst    DWORD PTR tv78[ebp]    ## Store it to a temporary but...
        *** NEVER LOADS IT BACK *** 
        *** And does the comparison between the FPU register and the memory image of foo ***
    fcomp    DWORD PTR _foo$[ebp]
    fnstsw    ax
    test    ah, 68                    ; 00000044H
    jp    SHORT $L800
    push    OFFSET FLAT:$SG801
    call    _printf

On the v7 compiler, you might get away with using /fp:strict or /fp:precise, but the v6 compiler lacks these options. (For the same reason, I haven't been able to check that theory!)

Maybe someone can come up with a preprocessor macro to map (float)x to something like ( float tmp = (float)d )? (Some of the macros in the Perl sources seem to do equally obscure things, but they fairly make my skin crawl :)

Personally, I'd prefer using cmpFsFd(), and perhaps an editor macro (with manual yea/nay) to change the sources. If the function were marked inline, it might not impose too much of a performance penalty, but you might have to be careful that the compiler doesn't optimise the tmp var away.

Anyway, I hope that is of some use to you.

Reference: http://webster.cs.ucr.edu/AoA/Windows/HTML/RealArithmetica2.html

11.2.5 Conversions

The FPU performs all arithmetic operations on 80 bit real quantities. In a sense, the FLD and FST/FSTP instructions are conversion instructions as well as data movement instructions because they automatically convert between the internal 80 bit real format and the 32 and 64 bit memory formats.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

RIP PCW


Replies are listed 'Best First'.
Re^2: [Win32, C, and way OT] C floats, doubles, and their equivalence by ikegami (Patriarch) on Jul 19, 2009 at 16:31 UTC
"but you might have to be careful that the compiler doesn't optimise the tmp var away."

Use `volatile` to force the compiler to avoid optimising the memory lookup away.
Re^3: [Win32, C, and way OT] C floats, doubles, and their equivalence by syphilis (Archbishop) on Jul 20, 2009 at 00:44 UTC
"Use volatile to force the compiler to avoid optimising the memory lookup away"

I think the above suggestion is made in relation to the use of a separate function, which is not the method I've adopted. (I've made use of a temp variable, but it's in the body of the function itself, rather than in a separate function.)

I find that declaring my temp variable as "volatile" doesn't help me. If I get rid of the `#pragma optimize()` calls, and declare the temp variable as `volatile float temp`, the problem remains.

Obviously, "volatile" doesn't turn off every kind of optimization, and certainly doesn't turn off the kind of optimization that it needs to (in my case).

Cheers,
Rob
Re^4: [Win32, C, and way OT] C floats, doubles, and their equivalence by ikegami (Patriarch) on Jul 20, 2009 at 07:43 UTC
If your code counters what I said, show it. Are you saying that the following works:

#include <stdio.h>

int cmpFsFd( float s, double d ) {
    float tmp = (float)d;
    return s == tmp ? 1 : 0;
}

int main(void) {
   double nv = 2.0 / 3;
   float foo = 2.0 / 3;

   if( foo == nv ) printf("True ");
   else printf("False ");

   if( cmpFsFd( foo, nv )) printf("True\n");
   else printf("False\n");

   return 0;
}

And that the following doesn't?

#include <stdio.h>

int main(void) {
   double nv = 2.0 / 3;
   float foo = 2.0 / 3;
   volatile float nv_as_float = (float)nv;

   if( foo == nv ) printf("True ");
   else printf("False ");

   if( foo == nv_as_float ) printf("True\n");
   else printf("False\n");

   return 0;
}

That makes no sense to me.
Re^5: [Win32, C, and way OT] C floats, doubles, and their equivalence by syphilis (Archbishop) on Jul 20, 2009 at 10:14 UTC
Re^5: [Win32, C, and way OT] C floats, doubles, and their equivalence by syphilis (Archbishop) on Jul 20, 2009 at 09:53 UTC
Re^2: [Win32, C, and way OT] C floats, doubles, and their equivalence by syphilis (Archbishop) on Jul 19, 2009 at 06:10 UTC
I think your detailed investigation demonstrates why the code I posted in response to creamygoodness's post behaves the way it does.

After much poking and scratching and re-reading of suggestions that have been kindly and thoughtfully presented here, I've eventually come up with using this approach in the PDL code:

#include <stdio.h>

#if defined _MSC_VER && _MSC_VER < 1400
#pragma optimize("", off)
#endif

int main(void) {

#if defined _MSC_VER && _MSC_VER < 1400
   double nv = 2.0 / 3;
   float foo = 2.0 / 3;
   float dummy = (float)nv;

   if(foo == dummy) printf("True ");
   else printf("False ");
#else
   double nv = 2.0 / 3;
   float foo = 2.0 / 3;

   if(foo == (float)nv) printf("True ");
   else printf("False ");
#endif

   return 0;
}

It seems to be doing the right thing with all of my compilers. I don't think that C script needs to have the optimization turned off; the creation of the dummy variable is alone sufficient to get the behaviour I'm after. But for some reason, in the PDL code, creation of the dummy variable is not, by itself, sufficient: optimization also needs to be disabled. (I turn it off for the setvaltobad functions, then turn it back on again.)

Thanks to all who replied.

Cheers,
Rob
Re^3: [Win32, C, and way OT] C floats, doubles, and their equivalence by BrowserUk (Patriarch) on Jul 19, 2009 at 07:28 UTC
"But for some reason, in the PDL code, creation of the dummy variable is not, by itself, sufficient - optimization also needs to be disabled. (I turn it off for the setvaltobad functions, then turn it back on again.)"

That's why I moved the temp var and comparison into a separate function; it forces the compiler to use the temp value from memory for the comparison:

; 5    :     return s == tmp ? 1 : 0;

    fld    DWORD PTR _s$[ebp]
    fcomp    DWORD PTR _tmp$[ebp]
    fnstsw    ax
    test    ah, 68                    ; 00000044H
    jp    SHORT $L809

The problem with the test script is that with optimisations enabled, the newer compiler is able to reduce the whole script to a simple `printf( "False" ); printf( "True" ); return 0;` (even with the use of the sub) as everything is known at compile time:

; Listing generated by Microsoft (R) Optimizing Compiler Version 15.00.30729.01

    TITLE    C:\test\float.c
    .686P
    .XMM
    include listing.inc
    .model    flat

INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES

_DATA    SEGMENT
$SG2527    DB    'True ', 00H
    ORG $+2
$SG2529    DB    'False ', 00H
    ORG $+1
$SG2531    DB    'True', 0aH, 00H
    ORG $+2
$SG2533    DB    'False', 0aH, 00H
_DATA    ENDS
PUBLIC    _cmpFsFd
EXTRN    __fltused:DWORD
; Function compile flags: /Ogtpy
; File c:\test\float.c
;    COMDAT _cmpFsFd
_TEXT    SEGMENT
tv135 = 8                        ; size = 4
_s$ = 8                          ; size = 4
_d$ = 12                         ; size = 8
_cmpFsFd PROC                    ; COMDAT

; 4    :     float tmp = (float)d;
; 5    :     return s == tmp ? 1 : 0;

    fld    DWORD PTR _s$[esp-4]
    fld    QWORD PTR _d$[esp-4]
    fstp    DWORD PTR tv135[esp-4]
    fld    DWORD PTR tv135[esp-4]
    fucompp
    fnstsw    ax
    test    ah, 68                    ; 00000044H
    jp    SHORT $LN3@cmpFsFd
    mov    eax, 1

; 6    : }

    ret    0
$LN3@cmpFsFd:

; 4    :     float tmp = (float)d;
; 5    :     return s == tmp ? 1 : 0;

    xor    eax, eax

; 6    : }

    ret    0
_cmpFsFd ENDP
_TEXT    ENDS
PUBLIC    _main
EXTRN    _printf:PROC
; Function compile flags: /Ogtpy
_TEXT    SEGMENT
_main    PROC

; 11   :    double nv = 2.0 / 3;                        ##
; 12   :    float foo = 2.0 / 3;                        ## All this and ...
; 13   :
; 14   :    if( foo == nv ) printf("True ");            ##
; 15   :    else printf("False ");

    push    OFFSET $SG2529
    call    _printf

; 16   :
; 17   :    if( cmpFsFd( foo, nv ) ) printf("True\n");  ## this are optimised away!

    push    OFFSET $SG2531
    call    _printf
    add    esp, 8

; 18   :    else printf("False\n");
; 19   :
; 20   :    return 0;

    xor    eax, eax

; 21   : }

    ret    0
_main    ENDP
_TEXT    ENDS
END
Re^4: [Win32, C, and way OT] C floats, doubles, and their equivalence by syphilis (Archbishop) on Jul 19, 2009 at 11:52 UTC
"That's why I moved the temp var and comparison into a separate function"

Having a separate function is appealing, but not so straightforward to implement. It would be fine if we had an xs file to fiddle with, but the fact that the source file to be amended is a pd file (not an xs file) adds some complexity to the problem.

I think it is possible to use the "separate function" approach, though it's actually "separate functions" (plural), as the templating dictates that we'll need separate functions to handle each of the different data types (ie byte, short, long, long long, double - not just float). There's also the issue of the second arg that gets supplied to the "separate function": it could be a UV or an IV, not necessarily an NV, so we need to accommodate that as well (probably not difficult).

It's a much simpler solution if one instead just adds a dummy variable and turns off optimization, which seems to work quite well.

Cheers,
Rob