in reply to Re^3: [OT] LLP64 .v. LP64 portability
in thread [OT] LLP64 .v. LP64 portability

Shutting up the warnings would be useful (for Perl) if I was convinced they were all false alarms, but I'm semi-convinced that one or more of them is the reason behind some traps I experience when I get close to allocating 4GB of virtual memory.

For Parrot, which effectively assumes sizeof( INTVAL ) == sizeof( void* ) all over the show, I'm convinced that the configuration utility (which actually warns of the problem) is doing the wrong thing with respect to how it defines the fundamental typedefs.
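To make that assumption concrete, here is a minimal sketch of checking it at compile time rather than discovering it at run time. INTVAL_SKETCH, ptr_to_int and long_is_pointer_sized are hypothetical names for illustration; Parrot's real INTVAL is chosen by its configure step, and this is not its code.

```c
#include <assert.h>   /* assert */
#include <stdint.h>   /* intptr_t */

/* Hypothetical stand-in for INTVAL, defined here as a pointer-sized integer. */
typedef intptr_t INTVAL_SKETCH;

/* Refuse to compile if the integer type cannot hold a pointer (C11). */
_Static_assert(sizeof(INTVAL_SKETCH) == sizeof(void *),
               "INTVAL_SKETCH must be pointer-sized");

/* Convert a pointer to the integer type and verify that it round-trips. */
static INTVAL_SKETCH ptr_to_int(void *p)
{
    INTVAL_SKETCH n = (INTVAL_SKETCH)p;
    assert((void *)n == p);
    return n;
}

/* Nonzero where long is pointer-sized (LP64, ILP32); zero on LLP64 (Win64). */
static int long_is_pointer_sized(void)
{
    return sizeof(long) == sizeof(void *);
}
```

A typedef based on long would pass the same check on LP64 systems and fail it on 64-bit Windows, which is the mismatch under discussion.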

Personally, I think that on Windows the following typedefs for STRLEN, IV and UV should be used. And most (if not all) uses of int, I32, and U32 should be dropped in favour of one of the 3 above.

#include <stdio.h>
#include <stddef.h>

typedef uintptr_t UV;
typedef ptrdiff_t STRLEN;
typedef intptr_t  IV;

#ifdef _WIN64
#define IS_WIN64 ""
#else
#define IS_WIN64 "not"
#endif

void main( void ){
    printf( "_WIN64 is %s defined\n", IS_WIN64 );
    printf( "MAX: %d\n", _INTEGRAL_MAX_BITS );
    printf( "size_t is %d bytes\n", sizeof( size_t ) );
    printf( "int is %d bytes\n", sizeof( int ) );
    printf( "long is %d bytes\n", sizeof( long ) );
    printf( "long long is %d bytes\n", sizeof( long long ) );
    printf( "UV is %d bytes\n", sizeof( UV ) );
    printf( "IV is %d bytes\n", sizeof( IV ) );
    printf( "STRLEN is %d bytes\n", sizeof( STRLEN ) );
}

Compiled for 32 & 64-bit targets this produces:

C:\test>size_t.exe
_WIN64 is not defined
MAX: 64
size_t is 4 bytes
int is 4 bytes
long is 4 bytes
long long is 8 bytes
UV is 4 bytes
IV is 4 bytes
STRLEN is 4 bytes

C:\test>size_t.exe
_WIN64 is defined
MAX: 64
size_t is 8 bytes
int is 4 bytes
long is 4 bytes
long long is 8 bytes
UV is 8 bytes
IV is 8 bytes
STRLEN is 8 bytes
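A side note on the test program itself: it passes a size_t to %d, which is only guaranteed to work where size_t and int have the same width. A hedged sketch of the portable spelling follows, assuming a C runtime that supports the C99 %zu conversion (older Microsoft runtimes did not). describe_size is a hypothetical helper, not part of the program above.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */
#include <stdio.h>    /* snprintf */

/* Format "<name> is <bytes> bytes" into buf using %zu, the conversion
   that matches size_t. Returns what snprintf returns: the number of
   characters the full string needs, excluding the terminating NUL. */
static int describe_size(char *buf, size_t buflen,
                         const char *name, size_t bytes)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, buflen, "%s is %zu bytes", name, bytes);
}
```

Where %zu is unavailable, casting the value to unsigned long long and printing it with %llu is a common fallback.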

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

Re^5: [OT] LLP64 .v. LP64 portability
by ikegami (Patriarch) on Apr 22, 2010 at 02:44 UTC
    I'm not familiar with those types. On paper, uintptr_t and intptr_t look good for UV and IV. However, STRLEN should remain whatever strlen returns. That is usually size_t, and it's accessed via Size_t.
      I'm not familiar with those types.

      They are standard types for Microsoft compilers: MS CRT Standard types.

      STRLEN should remain whatever strlen returns. That is usually size_t, and it's accessed via Size_t.

      The problem is, as you pointed out above, size_t, (actually defined as what sizeof() returns), is an unsigned type, and therefore cannot handle negative indexing.

      And since (on 32-bit), it isn't possible to have strings longer than 2GB, to me it makes sense to avoid the need for casting between signed and unsigned, and all the noise that adds to the sources, by utilising the otherwise unused high-bit to accommodate both Perl's negative indexing, and general pointer math.

      ptrdiff_t (long integer or __int64, depending on the target platform) Result of subtraction of two pointers.

      Seems to be perfectly defined for this purpose.

      POSIX (though not ANSI or ISO) also defines an equivalent type, ssize_t, for similar reasons.
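      A minimal sketch of the point being argued, assuming nothing about Perl's internals: a negative offset stored in size_t wraps to a huge positive value, while ptrdiff_t can carry it as-is and resolve it against the length. wrapped and resolve_index are hypothetical names used only for illustration.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t, ptrdiff_t */
#include <stdint.h>   /* SIZE_MAX */

/* Converting a negative value to size_t is well defined but wraps
   modulo SIZE_MAX + 1, so -1 becomes SIZE_MAX. */
static size_t wrapped(ptrdiff_t n)
{
    return (size_t)n;
}

/* Resolve a Perl-style index (negative counts from the end) against a
   length held in a signed type. Returns -1 if the index is out of range. */
static ptrdiff_t resolve_index(ptrdiff_t idx, ptrdiff_t len)
{
    assert(len >= 0);
    if (idx < 0)
        idx += len;
    return (idx >= 0 && idx < len) ? idx : -1;
}
```

      With a signed length, the comparison and the addition need no casts; with size_t, each mixed signed/unsigned expression would need one.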


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Seems to be perfectly defined for this purpose.

        STRLEN is used for variables whose value will be passed as the last argument of memcpy and for those that will receive the result of strlen. I don't see why it would be more suitable to use a different type than the actual type the functions use.

        The problem is, as you pointed out above, size_t, (actually defined as what sizeof() returns), is an unsigned type, and therefore cannot handle negative indexing.

        It does not need to handle negative indexing. The position and length are normalised before being stored into STRLEN vars, and the IV vars in which they are stored while being normalised can already handle negative numbers. Putting the position and length into signed STRLEN vars instead of (signed) IV vars is not going to help simplify pp_substr any.
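        A simplified sketch of that flow, not the actual pp_substr code: the signed arithmetic happens in IV variables, and only the final, non-negative results are stored in unsigned STRLEN variables. The typedefs and normalise_substr are assumptions made for illustration, and the sketch assumes the current length fits in an IV.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* intptr_t */

typedef intptr_t IV;      /* assumption: pointer-sized signed integer */
typedef size_t   STRLEN;  /* unsigned, as strlen and memcpy use */

/* Normalise a substr-style (position, length) pair against curlen.
   Negative pos counts from the end; negative len stops that many
   characters before the end. Returns 1 and fills *pos / *len on
   success, 0 if the position is outside the string. */
static int normalise_substr(IV pos_iv, IV len_iv, STRLEN curlen,
                            STRLEN *pos, STRLEN *len)
{
    IV cur = (IV)curlen;   /* assumes curlen <= INTPTR_MAX */
    IV end;

    assert(pos != NULL && len != NULL);

    if (pos_iv < 0)
        pos_iv += cur;
    if (pos_iv < 0 || pos_iv > cur)
        return 0;

    if (len_iv < 0) {
        end = cur + len_iv;
        if (end < pos_iv)
            end = pos_iv;              /* empty result, not negative */
    } else {
        end = (len_iv > cur - pos_iv) ? cur : pos_iv + len_iv;
    }

    *pos = (STRLEN)pos_iv;             /* both values are >= 0 here */
    *len = (STRLEN)(end - pos_iv);
    return 1;
}
```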

        There are three ways of simplifying pp_substr:

        • If the maximum string length is no bigger than IV_MAX on all platforms, the simplest solution is to add a range check at the top and treat the position and length as IV vars from then on.

          if (SvIOK_UV(pos_sv) && (UV)pos_iv > (UV)IV_MAX) goto BOUND_FAIL;
          if (SvIOK_UV(len_sv) && (UV)len_iv > (UV)IV_MAX) len_iv = IV_MAX;
        • Dictate that the arguments to substr are limited to the range of IV instead of IV+UV. Tough luck if your system supports longer strings. The code would be identical to the code in the previous bullet.

        • Define a type that can hold the entire supported range of positions and lengths, reducing the supported range of positions and lengths on some platforms if necessary. Check the arguments against that type's maximum value, then use that type for pos_iv and len_iv.
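        The range check common to these options can be sketched outside Perl as follows. This is an illustration only: struct arg and its is_uv flag stand in for an SV and SvIOK_UV, IV_MAX_SKETCH stands in for IV_MAX, and range_check is a hypothetical name.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* NULL */
#include <stdint.h>   /* intptr_t, uintptr_t, INTPTR_MAX */

typedef intptr_t  IV;   /* assumption: pointer-sized signed integer */
typedef uintptr_t UV;   /* assumption: pointer-sized unsigned integer */
#define IV_MAX_SKETCH INTPTR_MAX

/* Hypothetical argument: the bits of an IV plus a flag saying the value
   is really unsigned (the role SvIOK_UV plays in the snippet above). */
struct arg {
    IV  iv;
    int is_uv;
};

/* Returns 0 (the BOUND_FAIL case) if the position exceeds IV_MAX;
   otherwise stores the position and a length clamped to IV_MAX. */
static int range_check(struct arg pos, struct arg len,
                       IV *pos_out, IV *len_out)
{
    assert(pos_out != NULL && len_out != NULL);

    if (pos.is_uv && (UV)pos.iv > (UV)IV_MAX_SKETCH)
        return 0;

    *pos_out = pos.iv;
    *len_out = (len.is_uv && (UV)len.iv > (UV)IV_MAX_SKETCH)
                   ? IV_MAX_SKETCH
                   : len.iv;
    return 1;
}
```

        After this check both values are known to fit in a signed type, so the rest of the normalisation can work on IV values without looking at the unsigned flag again.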