Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight

Re^8: Influencing the Gconvert macro

by syphilis (Archbishop)
on Oct 04, 2020 at 05:09 UTC ( [id://11122528]=note: print w/replies, xml ) Need Help??

in reply to Re^7: Influencing the Gconvert macro
in thread Influencing the Gconvert macro

Do you know how to reproduce the warnings when compiling that C script ?

I didn't want to die wondering (even if it killed me), so I eventually came up with this C program that reproduces the warnings:
/* try.c */ #include <stdio.h> #include <stdlib.h> void foo(double); int main(void) { /* The value assigned to 'd' has no * * effect on the warning message. */ double d = 0.; foo(d); } void foo(double d) { char buf[127]; sprintf (buf, "%.*g\n", 126, d); printf("%s\n", buf); }
Build with: gcc -o try.exe try.c -Wformat-overflow
That compilation produces the following noise:
try.c: In function ‘foo’: try.c:17:17: warning: ‘%.*g’ directive writing between 1 and 133 bytes + into a region of size 127 [-Wformat-overflow=] sprintf (buf, "%.*g", 126, d); ^~~~ try.c:17:16: note: assuming directive output of 132 bytes sprintf (buf, "%.*g", 126, d); ^~~~~~ try.c:17:2: note: ‘sprintf’ output between 2 and 134 bytes into a dest +ination of size 127 sprintf (buf, "%.*g", 126, d); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
which is essentially the same as the warnings I received when compiling perl.

Apparently, the perl compilation process (with -O2 optimization) determines that the number of digits being requested is 126.
It is then correctly calculated that the number of bytes written will be between 1 and 133 - which allows for the decimal point, a possible leading '-', a possible 'e', and a possible exponent of (maximum) size of 4.
In those warnings, you'll see that the "1 and 133" changes to "2 and 134" - when the terminating NULL byte is included in the count.

I haven't investigated just how the perl source compilation process makes the determination that the digit count has to be 126. It might just be a bug in the -O2 optimization - certainly, no warnings are emitted if the optimization level is reduced to less that -O2.

As I mentioned previously, if the number of digits specified in the "%g" formatting is higher than 91, then the processing switches to a different block of code, so the buffer size of 127 is certainly large enough.
I haven't looked into how or why that change occurs when digits > 91. It's not often that people will request more digits than 91 - so I'm not presently inclined to wade through the whys and wherefores of that processing path. It seems to be working correctly, and IMO that's good enough for now, at least.

Update: Now that I understand how the warning is being created, I think it should be fairly simple to amend the perl source so that this warning is eliminated.


Replies are listed 'Best First'.
Re^9: Influencing the Gconvert macro
by syphilis (Archbishop) on Oct 04, 2020 at 10:39 UTC
    I think it should be fairly simple to amend the perl source so that this warning is eliminated

    For anyone interested, here's the patch to sv.c that fixes the problem in a way that doesn't produce any warnings:

    Update: The patch meets the objective, but there's still scope for improvement:
    1) "précis + 7" is overkill. All that's needed to avoid the compilation warning is "précis + 1";
    2) the condition "&& float_need < sizeof(ebuf)" is not needed as it's implied by the next condition;
    3) the variable float_need is always set at 35 which feels wrong. AFAICS, "sizeof(ebuf) - float_need" always evaluates to 92 (ie 127 - 35).

    --- /cygdrive/c/comp/perl-5.33.2/sv.c 2020-10-04 18:15:27.028350200 ++1100 +++ sv.c 2020-10-04 18:13:55.508841200 +1100 @@ -13109,13 +13109,17 @@ /* check, in manner not involving wrapping, that it w +ill * fit in ebuf */ && float_need < sizeof(ebuf) - && sizeof(ebuf) - float_need > precis + && sizeof(ebuf) - float_need > precis + 7 && !(width || left || plus || alt) && !fill && intsize != 'q' ) { WITH_LC_NUMERIC_SET_TO_NEEDED_IN(in_lc_numeric, - SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis) +#if defined(USE_LONG_DOUBLE) + PERL_UNUSED_RESULT(sprintf(ebuf, "%.*Lg", (int)pr +ecis, (NV) fv)) +#else + PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)pr +ecis, (NV) fv)) +#endif ); elen = strlen(ebuf); eptr = ebuf;
    AFAICT changing précis to précis + 7 does not change functionality, and it's enough to eradicate the warning that the other changes would otherwise introduce
    Here's the patch to t/op/sprintf2.t that would have detected the original issue:
    --- sprintf2.t_orig 2020-10-04 18:24:56 +1100 +++ sprintf2.t 2020-10-04 18:24:37 +1100 @@ -1178,4 +1178,25 @@ is($off2, 1, "offset after 0"); } +# %g formatting was broken on Ubuntu, Debian and perhaps other system +s +# for a long time. Here we verify that no such breakage still exists. + +if($Config{nvsize} == 8) { + cmp_ok(sprintf("%.54g", 0.3), 'eq', '0.29999999999999998889776975 +3748434595 763683319091796875', + "sprintf( \"%.54g\", 0.3 ) renders correctly"); +} +elsif($Config{nvtype} eq 'long double' && ($Config{longdblkind} == 3 +|| $Config {longdblkind} == 4)) { + cmp_ok(sprintf("%.64g", 0.3), 'eq', '0.30000000000000000001084202 +1724855044 3400745280086994171142578125', + "sprintf( \"%.64g\", 0.3 ) renders correctly"); +} +elsif($Config{nvtype} eq 'long double' && $Config{longdblkind} >= 5 & +& $Config{ longdblkind} <= 8) { + # oops ... TODO (for double-double) +else { + cmp_ok(sprintf("%.115g", 0.3), 'eq', + '0.2999999999999999999999999999999999903703502780638207347 +2011028707 5363407309491758923059023800306022167205810546875', + "sprintf( \"%.115g\", 0.3 ) renders correctly"); +} + + done_testing();
    Update:I probably should point out that the "original issue" afflicts only perls whose nvsize is 8, AFAICT.
    Therefore, those tests that I've added to sprintf2.t for other types of NV are, in fact,, unrelated to that "original issue".

      - && sizeof(ebuf) - float_need > precis + && sizeof(ebuf) - float_need > precis + 7

      Thanks for digging. :) I suspect that the compiler is (quite reasonably) failing to understand that we check the size of the radix point and adjust float_need accordingly - that it assumes worst case we will not enter that branch (of which it in any case cannot reasonably understand the intent), and so its analysis acts as if we're always treating the radix point as width 1.

      It would probably be reasonable to make the above change (but probably folding it into float_need instead:

      - + 1 /* default radix point '.' */ + + 8 /* worst case radix point */
      ) and remove the later "float_need += (SvCUR(PL_numeric_radix_sv) - 1)". Though it's sad that this isn't enough to remove the need for the whole WITH_LC_NUMERIC_SET_TO_NEEDED_IN malarkey.

      I'll respond to the github issue separately, it'll take me some time to go through the options you describe.


        I suspect that the compiler is (quite reasonably) failing to understand that we check the size of the radix point and adjust float_need accordingly

        I had initially assumed that was being checked by the original code but, of course, it was not.
        I think you are correct. Having now included such a check, the warning disappears.
        That check, as it appears in the 3 patches I posted today to is:
        && sizeof(ebuf) - precis > 10
        That was to allow for a leading sign, a leading zero, a decimal point, an exponent (up to 5 bytes), and the terminating NULL byte ... which adds up to 9.
        But I think the maximum additional bytes required is only 8. (If that leading zero is present, then there will be no exponent. Right ?)
        I decided to allow for an extra exponent byte so that long doubles were accommodated, in case they ever get included in the process. That took it up to 10 bytes and I added one more for safety.
        I now believe (even allowing for long doubles) I could have got away with:
        && sizeof(ebuf) - precis > 8
        I won't lose too much sleep over that.

        As regards float_need, it was always at its initial value of 35 (at least inside the block of code we're discussing).
        I could see no sense in looking at it.
        I had initially wondered why the processing fell through to a different block of code whenever precis >= 92. It was because 127 - 35 == 92.
        We can better determine whether the string will fit into the buffer by looking at the sizeof ebuf and the value of precis .

        Thanks for the feedback and help.
        I tend to take a very blinkered approach to fixing bugs, concentrating on as narrow a field as is needed to get the job done. So it's certainly advisable that someone undertakes a more open and measured appraisal of what I'm proposing.

        I updated today via gmail ... but I don't see the 3 patch files there. (I'll send them to the github thread once I've sent this off and checked that they really are not there.)


Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11122528]
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (None)
    As of 2024-04-25 00:55 GMT
    Find Nodes?
      Voting Booth?

      No recent polls found