in reply to Re^4: Weird behavior of int()
in thread Weird behavior of int()

Yeah that's sort of what I'm going for. I feel like 'int' ought to have a stronger contract of doing what's written on the label. If it can't convert to an int, it ought to return undef or 0, not just ignore the attempt, under the Principle of Least Astonishment.

Re^6: Weird behavior of int()
by syphilis (Archbishop) on May 22, 2024 at 02:24 UTC
    I feel like 'int' ought to have a stronger contract of doing what's written on the label.

    The documentation for int() says that it "Returns the integer portion of EXPR".
    It's using "integer" in the mathematical sense of "whole number", not in the programming sense of "IV".
    It just so happens that with the most commonly used builds of perl (where IV precision >= NV precision) every EXPR that (in numeric context) contains a fractional portion will also truncate to a value that fits into an IV ... so I can see how the confusion might arise.
    But, for example, we don't want int(2 ** 65) to start returning '0' or 'undef' or IV_MAX just because it's too big to fit into an IV.

    If you switch to a perl where IV precision < NV precision (such as perls whose IV size is 4 bytes, or whose NV type is __float128), then you'll encounter lots of cases where EXPR contains values with a fractional portion, and yet has an integer portion that's too big to fit into an IV.
    On those builds, and for those values of EXPR, int(EXPR) will happily and silently return just the "whole number" portion of EXPR as an NV.

    Cheers,
    Rob
      I guess the unspoken implied behavior of a function named "int()" can be interpreted more ways than I expected. When I use "int()" in perl, what I actually wanted was C's cast-to-int or JavaScript's parseInt(), followed by making assumptions that I have an integer safe for any integer purposes.

      Having learned about these edge cases, now I need to go back through all the controllers I've written and fix:

      my $count= int($c->request->params->{count});
      $c->detach(HTTP => 422, ["Invalid count"]) if $count < 0 || $count > $limit;
      because passing NaN to the controller would let a non-integer value leak through to the code beyond.
        When I use "int()" in perl, what I actually wanted was C's cast-to-int or JavaScript's parseInt()

        I don't know of any perl function that will do that.
        I'd do it as an XSub:
        SV * _to_IV(SV * in) {
            if(SvNV(in) < 0) return newSViv(SvIV(in));
            return newSVuv(SvUV(in));
        }
        However, in your case, assuming that $limit <= ~0 >> 1, you could just do it as:
        IV _to_IV(SV * in) { return SvIV(in); }
        which (if I'm thinking correctly) would still reject arguments greater than ~0 >> 1, because $count would then be negative.

        Of course, you might prefer to just check that the arg is not a NaN, if you trust the NaN != NaN test or POSIX::isnan or somesuch.
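        For comparison, the two NaN tests mentioned here look like this in C. This is only a sketch with hypothetical function names. One assumption worth stating: compilers run with aggressive floating-point options (GCC's -ffast-math, for example) are allowed to assume NaN never occurs, which may be one reason to distrust the self-comparison form.

```c
#include <math.h>
#include <stdbool.h>

/* NaN is the only value that compares unequal to itself. */
bool nan_by_self_compare(double x) {
    return x != x;
}

/* The C99 isnan() classification macro from math.h. */
bool nan_by_isnan(double x) {
    return isnan(x) != 0;
}
```

        Under default compiler settings the two agree for every input, including the infinities, which are not NaN.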

        Cheers,
        Rob
Re^6: Weird behavior of int()
by pryrt (Abbot) on May 21, 2024 at 21:39 UTC
    I feel like 'int' ought to have a stronger contract of doing what's written on the label. If it can't convert to an int, it ought to return undef or 0, not just ignore the attempt

    int doesn't contract to return an IV; it contracts to return the integer portion of the expression given. An infinitely large floating point would convert to an infinitely large integer, so infinity is still the logical conclusion (IMO).

    In my mind, perl's int is equivalent to C's trunc from math.h, not to the int typecast in C.

    #include <math.h>
    #include <stdio.h>

    int main() {
        double x;
        x = 3.14;      printf("trunc(%f) = %f\n", x, trunc(x));
        x = -2.718;    printf("trunc(%f) = %f\n", x, trunc(x));
        x = +INFINITY; printf("trunc(%f) = %f\n", x, trunc(x));
        x = -INFINITY; printf("trunc(%f) = %f\n", x, trunc(x));
        x = +NAN;      printf("trunc(%f) = %f\n", x, trunc(x));
        x = -NAN;      printf("trunc(%f) = %f\n", x, trunc(x));
    }
    output:
    trunc(3.140000) = 3.000000
    trunc(-2.718000) = -2.000000
    trunc(inf) = inf
    trunc(-inf) = -inf
    trunc(nan) = nan
    trunc(nan) = nan

    The docs for int could be improved to say what it does on those edge cases (the trunc docs that I linked do explicitly define those behaviors), but I personally think that int returns the right thing for those inputs, and for me, it is the Answer of Least Astonishment.

      Your particular trunc might have that behaviour, but trunc in general might not.

      The spec simply describes trunc as follows:

      The trunc functions round their argument to the integer value, in floating format, nearest to but no larger in magnitude than the argument.

      The spec simply describes what it returns as follows:

      The trunc functions return the truncated integer value.

      Passing an infinity or a NaN is therefore undefined behaviour. Anything could happen.

        The spec

        Which spec?

        Unless I read this wrong, at least IEEE Std 1003.1-2017 (i.e., the POSIX standard) defines NaN and Inf behavior for trunc, as did the older IEEE Std 1003.1-2008 edition.

        And while not every implementation will follow POSIX, I don't feel I am wrong in expecting that following the behavior in an IEEE standard is reasonable behavior. And if it's reasonable behavior, then I don't feel I was wrong in arguing that the behavior makes sense for perl's int as well.

        (And from what I remember of IEEE754 -- though it's been a while since I've read the relevant portions -- NaN was supposed to propagate if you try to do the various mathematical operators or functions on it. So my argument is in keeping with the spirit of that spec (at least as far as I remember it).)

        update: I looked it up when I got to work the next day. IEEE 754-2008 section 6.2 "Operations with NaNs": "For an operation with quiet NaN inputs, other than maximum and minimum operations, if a floating-point result is to be delivered the result shall be a quiet NaN which should be one of the input NaNs." And in 6.2.3 "NaN Propagation": "An operation that propagates a NaN operand to its result and has a single NaN as an input should produce a NaN with the payload of the input NaN if representable in the destination format." Floating-point NaNs are intended to propagate.