http://qs1969.pair.com?node_id=11122385


in reply to Re^3: Influencing the Gconvert macro
in thread Influencing the Gconvert macro

I've seen such warnings in smoke logs a number of times, and tried to investigate at least a couple of times without success; I'd love (someone) to get to the bottom of it, at least to understand if it's a real problem (that needs to be fixed by increasing a buffer size).

I've never seen anyone report an actual problem relating to the warning though.

Hugo

Re^5: Influencing the Gconvert macro
by syphilis (Archbishop) on Oct 01, 2020 at 07:38 UTC
    I'd love (someone) to get to the bottom of it, at least to understand if it's a real problem (that needs to be fixed by increasing a buffer size).

    I can verify that the "%.*g" formatting still works ok when the buffer size needs to be bigger than 127.
    For example, with this patched perl, perl -le 'printf "%.751g\n", 2 ** - 1074;' prints out all 751 mantissa digits correctly and displays the exact decimal rendition of the value 2 ** -1074.
    $ ./perl -I./lib -le 'printf "%.751g\n", 2 ** - 1074;'
    4.94065... another 740 digits ...65625e-324

    Actually, I can add a little to that.
    I hacked the source to print out the size of the buffer, and I've just discovered that, so long as I ask for no more than 91 digits, the buffer size is 127 - which should be sufficient.
    But as soon as I ask for more than 91 digits, the buffer size is not displayed - indicating that the processing has switched to a different block.
    Incredibly, when I ask for more than 91 digits, I've also just now realized that the "%.*g" formatting works fine on Ubuntu-18.04 perls. That is, the bug exists only when I request 18 to 91 (inclusive) digits.
    If I request a number outside of that range, it works fine on a standard (unpatched) perl-5.32.0 on Ubuntu-18.04:
    $ perl -le 'printf "%.91g\n", 2 ** -1074;'
    4.9406564584124654e-324
    $ perl -le 'printf "%.92g\n", 2 ** -1074;'
    4.9406564584124654417656879286822137236505980261432476442558568250067550727020875186529983636e-324
    So it looks to me that the concern surrounding the 127-byte buffer is unfounded, because the processing switches to a different block as soon as we ask for more than 91 digits.
    But I'm not prepared to claim that I've actually proved anything ;-)

    Cheers,
    Rob

      Ok, the main calculation in the perl source is I think this statement from sv.c:

      /* Determine the buffer size needed for the various
       * floating-point formats.
       *
       * The basic possibilities are:
       *
       *               <---P--->
       *    %f 1111111.123456789
       *    %e       1.111111123e+06
       *    %a     0x1.0f4471f9bp+20
       *    %g        1111111.12
       *    %g       1.11111112e+15
       *
       * where P is the value of the precision in the format, or 6
       * if not specified. Note the two possible output formats of
       * %g; in both cases the number of significant digits is <=
       * precision.
       *
       * For most of the format types the maximum buffer size needed
       * is precision, plus: any leading 1 or 0x1, the radix
       * point, and an exponent. The difficult one is %f: for a
       * large positive exponent it can have many leading digits,
       * which needs to be calculated specially. Also %a is slightly
       * different in that in the absence of a specified precision,
       * it uses as many digits as necessary to distinguish
       * different values.
       *
       * First, here are the constant bits. For ease of calculation
       * we over-estimate the needed buffer size, for example by
       * assuming all formats have an exponent and a leading 0x1.
       *
       * Also for production use, add a little extra overhead for
       * safety's sake. Under debugging don't, as it means we're
       * more likely to quickly spot issues during development.
       */

      float_need =     1  /* possible unary minus */
                    +  4  /* "0x1" plus very unlikely carry */
                    +  1  /* default radix point '.' */
                    +  2  /* "e-", "p+" etc */
                    +  6  /* exponent: up to 16383 (quad fp) */
      #ifndef DEBUGGING
                    + 20  /* safety net */
      #endif
                    +  1; /* \0 */

      ... after which, if we are subject to locale, it checks the actual length of the UTF-8 representation of the radix point and adjusts that "+ 1" for the default accordingly. On a non-debugging build the above adds up to 35, which is pretty close to the difference between 91 and 127 (36).

      The origin of the gcc warning looks like it might be gimple-ssa-sprintf.c or a close relative, in which case the "#define target_mb_len_max() 6" may well explain the difference between 127 and 133.

      So this looks pretty safe to me - and you'd certainly need a debugging perl to get close to exercising the limits.

      That just leaves the question of whether we can give the compiler enough hints for it to come to the same conclusion, or whether we'd only be able to shut it up with a sledgehammer preprocessor directive.

      Hugo

        That just leaves the question of whether we can give the compiler enough hints for it to come to the same conclusion, or whether we'd only be able to shut it up with a sledgehammer preprocessor directive.

        I'm puzzled as to how/why this check is even being run.

        I've just built perl-5.33.2 with the usual configure args, making no attempt to influence the setting of Gconvert.
        But I've applied this patch to sv.c:
        --- sv.c        2020-09-29 22:29:16.781395700 +1000
        +++ sv.c_mod    2020-10-02 11:35:20.728840400 +1000
        @@ -13115,7 +13115,7 @@
                         && intsize != 'q'
                     ) {
                         WITH_LC_NUMERIC_SET_TO_NEEDED_IN(in_lc_numeric,
        -                    SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis)
        +                    PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                         );
                         elen = strlen(ebuf);
                         eptr = ebuf;
        That works fine but I'm not happy about the double-rounding that takes place when nvtype is 'double' 'long double'.
        We really want fv to be an NV, not a long double.
        And then we would need the sprintf() formatting to accommodate the nvtype - "g" versus "Lg".

        UPDATE: Duh ... there is no double-rounding ... but I think I still need to attend to the issue of "g" or "Lg" formatting.

        And it still produces that awful noise (see below my sig).
        The command that produces that noise is:
        cc -c -DPERL_CORE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -std=c89 -O2 -Wall -Werror=pointer-arith -Wextra -Wc++-compat -Wwrite-strings -Werror=declaration-after-statement sv.c
        So I've tried (unsuccessfully) to reproduce those warnings by compiling the following C program:
        #include <stdio.h>

        int main(void) {
            char ebuf[127];
            long double fv = 0.3L;
            int precis = 54;
            sprintf(ebuf, "%.*g", precis, (double) fv);
            printf("%s\n", ebuf);
            return 0;
        }
        I compiled it by running the same command (minus the perl-specific "-D..." switches) and it compiles noiselessly.
        So I guess that the noise must be introduced by something in those perl-specific switches.

        Do you know how to reproduce the warnings when compiling that C program?

        Incidentally, AFAICS, that patch effectively removes Gconvert from the perl source entirely - except for Win32API-File, where the Gconvert call in cpan\Win32API-File\const2perl.h could be replaced with sprintf(), anyway.
        For Windows, Gconvert is already hard coded to sprintf().

        Cheers,
        Rob
        In file included from sv.c:32:0:
        sv.c: In function ‘Perl_sv_vcatpvfn_flags’:
        sv.c:13118:54: warning: ‘%.*g’ directive writing between 1 and 133 bytes into a region of size 127 [-Wformat-overflow=]
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                                                              ^
        perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_NEEDED_IN’
                     block;                                                         \
                     ^~~~~
        sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                             ^
        sv.c:13118:54: note: assuming directive output of 132 bytes
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                                                              ^
        perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_NEEDED_IN’
                     block;                                                         \
                     ^~~~~
        sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                             ^
        In file included from /usr/include/stdio.h:862:0,
                         from perlio.h:41,
                         from iperlsys.h:50,
                         from perl.h:3934,
                         from sv.c:32:
        /usr/include/x86_64-linux-gnu/bits/stdio2.h:33:10: note: ‘__builtin___sprintf_chk’ output between 2 and 134 bytes into a destination of size 127
           return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
               __bos (__s), __fmt, __va_arg_pack ());
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In file included from sv.c:32:0:
        sv.c:13118:54: warning: ‘%.*g’ directive writing between 1 and 133 bytes into a region of size 127 [-Wformat-overflow=]
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                                                              ^
        perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_NEEDED_IN’
                     block;                                                         \
                     ^~~~~
        sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                             ^
        sv.c:13118:54: note: assuming directive output of 132 bytes
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                                                              ^
        perl.h:6791:13: note: in definition of macro ‘WITH_LC_NUMERIC_SET_TO_NEEDED_IN’
                     block;                                                         \
                     ^~~~~
        sv.c:13118:21: note: in expansion of macro ‘PERL_UNUSED_RESULT’
                             PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (NV) fv))
                             ^
        In file included from /usr/include/stdio.h:862:0,
                         from perlio.h:41,
                         from iperlsys.h:50,
                         from perl.h:3934,
                         from sv.c:32:
        /usr/include/x86_64-linux-gnu/bits/stdio2.h:33:10: note: ‘__builtin___sprintf_chk’ output between 2 and 134 bytes into a destination of size 127
           return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
               __bos (__s), __fmt, __va_arg_pack ());