Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re^7: Influencing the Gconvert macro

by syphilis (Archbishop)
on Oct 02, 2020 at 03:55 UTC ( #11122466=note: print w/replies, xml ) Need Help??


in reply to Re^6: Influencing the Gconvert macro
in thread Influencing the Gconvert macro

That just leaves the question of whether we can give the compiler enough hints for it to come to the same conclusion, or whether we'd only be able to shut it up with a sledgehammer preprocessor directive.

I'm puzzled as to how/why this check is even being run.

I've just built perl-5.33.2 with the usual configure args , making no attempt to influence the setting of Gconvert.
But I've applied this patch to sv.c:
--- sv.c 2020-09-29 22:29:16.781395700 +1000 +++ sv.c_mod 2020-10-02 11:35:20.728840400 +1000 @@ -13115,7 +13115,7 @@ && intsize != 'q' ) { WITH_LC_NUMERIC_SET_TO_NEEDED_IN(in_lc_numeric, - SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis) + PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ); elen = strlen(ebuf); eptr = ebuf;
That works fine but I'm not happy about the double-rounding that takes place when nvtype is 'double' 'long double'.
We really want fv to be an NV, not a long double.
And then we would need the sprintf() formatting to accommodate the nvtype - "g" versus "Lg".

UPDATE: Duh ... there is no double-rounding ... but I think I still need to attend to the issue of "g" or "Lg" formatting.

And it still produces that awful noise (see below my sig).
The command that produces that noise is:
cc -c -DPERL_CORE -fwrapv -fno-strict-aliasing -pipe -fstack-protector +-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS= +64 -std=c89 -O2 -Wall -Werror=pointer-arith -Wextra -Wc++-compat -Wwr +ite-strings -Werror=declaration-after-statement sv.c
So I've tried (unsuccessfully) to reproduce those warnings by compiling the following C program:
#include <stdio.h> int main(void) { char ebuf[127]; long double fv = 0.3L; int precis = 54; sprintf(ebuf, "%.*g", precis, (double) fv); printf("%s\n", ebuf); return 0; }
I compiled it by running the same command (minus the perl-specific "-D..." switches) and it compiles noiselessly.
So I guess that the noise must be introduced by something in those perl-specific switches.

Do you know how to reproduce the warnings when compiling that C script ?

Incidentally, AFAICS, that patch effectively removes Gconvert from the perl source entirely - except for Win32API-File, where the Gconvert call in cpan\Win32API-File\const2perl.h could be replaced with sprintf(), anyway.
For Windows, Gconvert is already hard coded to sprintf().

Cheers,
Rob
In file included from sv.c:32:0: sv.c: In function ‘Perl_sv_vcatpvfn_flags’: sv.c:13118:54: warning: ‘%.*g’ directive writing between 1 and 133 byt +es into a region of size 127 [-Wformat-overflow=] PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_N +EEDED_IN’ block; + \ ^~~~~ sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’ PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ sv.c:13118:54: note: assuming directive output of 132 bytes PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_N +EEDED_IN’ block; + \ ^~~~~ sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’ PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ In file included from /usr/include/stdio.h:862:0, from perlio.h:41, from iperlsys.h:50, from perl.h:3934, from sv.c:32: /usr/include/x86_64-linux-gnu/bits/stdio2.h:33:10: note: ‘__builtin___ +sprintf_chk’ output between 2 and 134 bytes into a destination of siz +e 127 return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ __bos (__s), __fmt, __va_arg_pack ()); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from sv.c:32:0: sv.c:13118:54: warning: ‘%.*g’ directive writing between 1 and 133 byt +es into a region of size 127 [-Wformat-overflow=] PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_N +EEDED_IN’ block; + \ ^~~~~ sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’ PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ sv.c:13118:54: note: assuming directive output of 132 bytes PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_N +EEDED_IN’ block; + \ ^~~~~ sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’ PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pre +cis, (NV) fv)) ^ In file included from /usr/include/stdio.h:862:0, from perlio.h:41, from iperlsys.h:50, from perl.h:3934, from sv.c:32: /usr/include/x86_64-linux-gnu/bits/stdio2.h:33:10: note: ‘__builtin___ +sprintf_chk’ output between 2 and 134 bytes into a destination of siz +e 127 return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ __bos (__s), __fmt, __va_arg_pack ());

Replies are listed 'Best First'.
Re^8: Influencing the Gconvert macro
by syphilis (Archbishop) on Oct 04, 2020 at 05:09 UTC
    Do you know how to reproduce the warnings when compiling that C script ?

    I didn't want to die wondering (even if it killed me), so I eventually came up with this C program that reproduces the warnings:
    /* try.c */ #include <stdio.h> #include <stdlib.h> void foo(double); int main(void) { /* The value assigned to 'd' has no * * effect on the warning message. */ double d = 0.; foo(d); } void foo(double d) { char buf[127]; sprintf (buf, "%.*g\n", 126, d); printf("%s\n", buf); }
    Build with: gcc -o try.exe try.c -Wformat-overflow
    That compilation produces the following noise:
    try.c: In function ‘foo’: try.c:17:17: warning: ‘%.*g’ directive writing between 1 and 133 bytes + into a region of size 127 [-Wformat-overflow=] sprintf (buf, "%.*g", 126, d); ^~~~ try.c:17:16: note: assuming directive output of 132 bytes sprintf (buf, "%.*g", 126, d); ^~~~~~ try.c:17:2: note: ‘sprintf’ output between 2 and 134 bytes into a dest +ination of size 127 sprintf (buf, "%.*g", 126, d); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    which is essentially the same as the warnings I received when compiling perl.

    Apparently, the perl compilation process (with -O2 optimization) determines that the number of digits being requested is 126.
    It is then correctly calculated that the number of bytes written will be between 1 and 133 - which allows for the decimal point, a possible leading '-', a possible 'e', and a possible exponent of (maximum) size of 4.
    In those warnings, you'll see that the "1 and 133" changes to "2 and 134" - when the terminating NULL byte is included in the count.

    I haven't investigated just how the perl source compilation process makes the determination that the digit count has to be 126. It might just be a bug in the -O2 optimization - certainly, no warnings are emitted if the optimization level is reduced to less that -O2.

    As I mentioned previously, if the number of digits specified in the "%g" formatting is higher than 91, then the processing switches to a different block of code, so the buffer size of 127 is certainly large enough.
    I haven't looked into how or why that change occurs when digits > 91. It's not often that people will request more digits than 91 - so I'm not presently inclined to wade through the whys and wherefores of that processing path. It seems to be working correctly, and IMO that's good enough for now, at least.

    Update: Now that I understand how the warning is being created, I think it should be fairly simple to amend the perl source so that this warning is eliminated.

    Cheers,
    Rob
      I think it should be fairly simple to amend the perl source so that this warning is eliminated

      For anyone interested, here's the patch to sv.c that fixes the problem in a way that doesn't produce any warnings:

      Update: The patch meets the objective, but there's still scope for improvement:
      1) "précis + 7" is overkill. All that's needed to avoid the compilation warning is "précis + 1";
      2) the condition "&& float_need < sizeof(ebuf)" is not needed as it's implied by the next condition;
      3) the variable float_need is always set at 35 which feels wrong. AFAICS, "sizeof(ebuf) - float_need" always evaluates to 92 (ie 127 - 35).

      --- /cygdrive/c/comp/perl-5.33.2/sv.c 2020-10-04 18:15:27.028350200 ++1100 +++ sv.c 2020-10-04 18:13:55.508841200 +1100 @@ -13109,13 +13109,17 @@ /* check, in manner not involving wrapping, that it w +ill * fit in ebuf */ && float_need < sizeof(ebuf) - && sizeof(ebuf) - float_need > precis + && sizeof(ebuf) - float_need > precis + 7 && !(width || left || plus || alt) && !fill && intsize != 'q' ) { WITH_LC_NUMERIC_SET_TO_NEEDED_IN(in_lc_numeric, - SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis) +#if defined(USE_LONG_DOUBLE) + PERL_UNUSED_RESULT(sprintf(ebuf, "%.*Lg", (int)pr +ecis, (NV) fv)) +#else + PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pr +ecis, (NV) fv)) +#endif ); elen = strlen(ebuf); eptr = ebuf;
      AFAICT changing précis to précis + 7 does not change functionality, and it's enough to eradicate the warning that the other changes would otherwise introduce
      Here's the patch to t/op/sprintf2.t that would have detected the original issue:
      --- sprintf2.t_orig 2020-10-04 18:24:56 +1100 +++ sprintf2.t 2020-10-04 18:24:37 +1100 @@ -1178,4 +1178,25 @@ is($off2, 1, "offset after 0"); } +# %g formatting was broken on Ubuntu, Debian and perhaps other system +s +# for a long time. Here we verify that no such breakage still exists. + +if($Config{nvsize} == 8) { + cmp_ok(sprintf("%.54g", 0.3), 'eq', '0.29999999999999998889776975 +3748434595 763683319091796875', + "sprintf( \"%.54g\", 0.3 ) renders correctly"); +} +elsif($Config{nvtype} eq 'long double' && ($Config{longdblkind} == 3 +|| $Config {longdblkind} == 4)) { + cmp_ok(sprintf("%.64g", 0.3), 'eq', '0.30000000000000000001084202 +1724855044 3400745280086994171142578125', + "sprintf( \"%.64g\", 0.3 ) renders correctly"); +} +elsif($Config{nvtype} eq 'long double' && $Config{longdblkind} >= 5 & +& $Config{ longdblkind} <= 8) { + # oops ... TODO (for double-double) +else { + cmp_ok(sprintf("%.115g", 0.3), 'eq', + '0.2999999999999999999999999999999999903703502780638207347 +2011028707 5363407309491758923059023800306022167205810546875', + "sprintf( \"%.115g\", 0.3 ) renders correctly"); +} + + done_testing();
      Update:I probably should point out that the "original issue" afflicts only perls whose nvsize is 8, AFAICT.
      Therefore, those tests that I've added to sprintf2.t for other types of NV are, in fact,, unrelated to that "original issue".

      Cheers,
      Rob
        - && sizeof(ebuf) - float_need > precis + && sizeof(ebuf) - float_need > precis + 7

        Thanks for digging. :) I suspect that the compiler is (quite reasonably) failing to understand that we check the size of the radix point and adjust float_need accordingly - that it assumes worst case we will not enter that branch (of which it in any case cannot reasonably understand the intent), and so its analysis acts as if we're always treating the radix point as width 1.

        It would probably be reasonable to make the above change (but probably folding it into float_need instead:

        - + 1 /* default radix point '.' */ + + 8 /* worst case radix point */
        ) and remove the later "float_need += (SvCUR(PL_numeric_radix_sv) - 1)". Though it's sad that this isn't enough to remove the need for the whole WITH_LC_NUMERIC_SET_TO_NEEDED_IN malarkey.

        I'll respond to the github issue separately, it'll take me some time to go through the options you describe.

        Hugo

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11122466]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (2)
As of 2022-01-29 10:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    In 2022, my preferred method to securely store passwords is:












    Results (74 votes). Check out past polls.

    Notices?