Laurent_R has asked for the wisdom of the Perl Monks concerning the following question:

Dear fellow monks, I just ran into the following problem running a Perl program under VMS.
perl -v
This is perl, v5.8.6 built for VMS_IA64
(with 2 registered patches, see perl -V for more detail)
Copyright 1987-2004, Larry Wall (...)
(You don't need to tell me that this is a very old version of Perl, I know, I agree, and there isn't much I can do about it.)

I have a small program trying to list the files on a disk partitioned by age and by size, with some typical code like this:

for my $entry (@files) {    # @files contains the files I want to look at
    next if -z $entry;
    my $size = -s $entry;
    # more code
}

My problem is that for a dozen or so very large files, the $size variable takes a negative value. For example, in one specific case, I got the following value: -426648576. Note that the stat function returns the same negative value in field # 7 (size in bytes) and an empty string for field # 12 (size in blocks).

Obviously, the stat function and the -s file test operator are suffering from integer overflow.

Also note that POSIX's stat does not help and gives the same result.

I have found a workaround: using a VMS system command within backticks to get the size in blocks from VMS, then doing the necessary extraction and conversion. The specific file example above gives me a size of 7,555,310 blocks, which corresponds to about 3.8 GB (a VMS block is equal to 512 bytes). So I am OK: my program does what I want with this workaround.
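As a rough illustration of the conversion step only, here is a minimal sketch. The helper name `blocks_to_bytes` and the constant are hypothetical, and the block count is assumed to have already been extracted from the system command's output. The result is an upper bound if the last block is only partly used.

```perl
use strict;
use warnings;

# Assumption from the post: one VMS block is 512 bytes.
use constant VMS_BLOCK_BYTES => 512;

# Hypothetical helper: convert a block count to a size in bytes.
sub blocks_to_bytes {
    my ($blocks) = @_;
    return $blocks * VMS_BLOCK_BYTES;
}

my $bytes = blocks_to_bytes(7_555_310);
printf "%.0f bytes (%.2f GB)\n", $bytes, $bytes / 1e9;
# prints: 3868318720 bytes (3.87 GB)

# The same number falls out of the negative value reported by -s
# once 2**32 is added to it.
printf "%.0f\n", -426_648_576 + 2**32;
# prints: 3868318720
```

The two figures agree exactly, which is consistent with the size having wrapped around a signed 32-bit integer once.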

I wonder, however, if there isn't a better way to get the size in pure Perl (i.e. without running a VMS system command within backticks), without having this integer overflow problem. Any idea?

Replies are listed 'Best First'.
Re: Integer overflow in -s or file stat results
by pryrt (Abbot) on Mar 07, 2017 at 15:41 UTC
    perl -V:sizesize -V:sizetype

    On my Win32 perl 5.24.0 x64, I get 8byte size_t. On my ancient CentOS 4.6 perl 5.8.5 (got you beat with how ancient a perl I'm stuck with is), I see 4byte size_t. If you convert your 4byte negative into a 4byte unsigned, or into a float using float(4byte signed integer) + 2**32, you'll get your 3.8GB size:

    perl -e 'printf "%08x => %.0f => %.3e\n", $_, $_+2**32, $_+2**32 for -426648576'

    I am thus betting you'll get -V:sizesize of 4 bytes, (edit:) but it will be a float rather than an int

    edit2: improve parens in the paragraph. Also, to base the conversion on the sizesize, use Config; $Config{sizesize}==4 ? convert($filesize): $filesize...

    edit3: clarify signed integer as the argument to float(); remove extra single-quote in edit2. Why didn't I see these before posting?
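The conversion described in this reply can be sketched as follows. The function name `unwrap_32bit_size` is hypothetical, and the sketch assumes the true size is below 2**32 bytes, so that the value wrapped at most once.

```perl
use strict;
use warnings;

# Hypothetical helper: reinterpret a size that was reported as a
# negative signed 32-bit integer. Only valid when the real size is
# smaller than 2**32 bytes (4 GiB).
sub unwrap_32bit_size {
    my ($size) = @_;
    return $size < 0 ? $size + 2**32 : $size;
}

printf "%.0f\n", unwrap_32bit_size(-426648576);
# prints: 3868318720
```

Sizes that are already non-negative pass through unchanged, so the helper can be applied to every file in the list.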

      Thank you very much, pryrt for these ideas.

      Unfortunately,

      perl -V:sizesize -V:sizetype
      does not produce anything useful under VMS (just the same output as perl -V).

      Converting the negative output would work for the specific example I gave, but I am afraid it is probably not a reliable option because some other files are still significantly larger (10 GB, or so), so there is no way for me to know how many times I would need to add 2**32.

      Thanks anyway for your answer.
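The ambiguity mentioned above can be shown with a small simulation. This is only a sketch: `wrap_int32` is a hypothetical function that mimics truncation to a signed 32-bit integer, and the second size is an invented example that is exactly 2**32 bytes larger than the first.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: what a signed 32-bit field would hold for a
# given true size (simulated with floating-point arithmetic).
sub wrap_int32 {
    my ($n) = @_;
    my $low = $n - 2**32 * floor($n / 2**32);   # keep the low 32 bits
    return $low >= 2**31 ? $low - 2**32 : $low;
}

my $size_a = 3_868_318_720;          # the file from the original post
my $size_b = $size_a + 2**32;        # invented: a file 4 GiB larger

printf "%.0f -> %.0f\n", $_, wrap_int32($_) for $size_a, $size_b;
# prints:
# 3868318720 -> -426648576
# 8163286016 -> -426648576
```

Both sizes collapse to the same wrapped value, so the wrapped value alone cannot say how many times 2**32 has to be added back.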

Re: Integer overflow in -s or file stat results
by BrowserUk (Patriarch) on Mar 07, 2017 at 17:00 UTC

    It'll probably suffer the same flaw, but you could try POSIX::lseek with SEEK_END and an offset of 0. If successful it returns the byte offset of the position it reached.
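A minimal sketch of that suggestion, assuming a Perl where `POSIX::open` and `POSIX::lseek` behave as documented. The helper name `size_via_lseek` is hypothetical, and the temporary file is only there to make the example self-contained.

```perl
use strict;
use warnings;
use POSIX ();
use Fcntl qw(O_RDONLY SEEK_END);
use File::Temp qw(tempfile);

# Hypothetical helper: ask the OS where the end of the file is.
sub size_via_lseek {
    my ($path) = @_;
    my $fd = POSIX::open($path, O_RDONLY);
    defined $fd or die "Cannot open $path: $!";
    my $end = POSIX::lseek($fd, 0, SEEK_END);   # offset 0 from the end
    my $err = $!;
    POSIX::close($fd);
    defined $end or die "lseek failed on $path: $err";
    return $end + 0;    # turns "0 but true" into a plain 0
}

# Demonstration on a small temporary file.
my ($fh, $name) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} 'x' x 1234;
close $fh or die "close failed: $!";
print size_via_lseek($name), "\n";
```

If the platform's file offset type is a signed 32-bit integer, the value returned for a file of 2 GB or more can be expected to wrap just as it does with stat.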

    If it was really important to avoid system, you could step through the file using seek() with SEEK_CUR and a relative offset of 2**31 (remembering how many steps) and reading a byte until the read fails; then step back to the last good position and do a binary chop until you find the last position at which you can read a byte.
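The stepping and binary chop could look roughly like the sketch below. It is untested on VMS, and several things in it are assumptions: the name `probe_file_size`, the default step of 2**30 (chosen to stay safely inside a signed 32-bit offset), and the decision to keep our own record of the position instead of asking the system for it. Only relative seeks are used, so no single offset passed to seek exceeds the step.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Hypothetical helper: find a file's size by probing for readable bytes.
sub probe_file_size {
    my ($path, $step) = @_;
    $step ||= 2**30;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $pos = 0;    # our own record of the current offset

    # Reach an absolute offset using relative hops of at most $step.
    my $move_to = sub {
        my ($target) = @_;
        my $delta = $target - $pos;
        while ($delta != 0) {
            my $hop = $delta >  $step ?  $step
                    : $delta < -$step ? -$step
                    :                    $delta;
            seek($fh, $hop, SEEK_CUR) or die "seek failed: $!";
            $pos   += $hop;
            $delta -= $hop;
        }
    };

    # True if a byte exists at the given offset.
    my $readable_at = sub {
        my ($offset) = @_;
        $move_to->($offset);
        my $got = read($fh, my $byte, 1);
        defined $got or die "read failed: $!";
        $pos += $got;
        return $got == 1;
    };

    return 0 unless $readable_at->(0);      # empty file
    my ($good, $bad) = (0, $step);          # readable / not yet known
    while ($readable_at->($bad)) {          # coarse forward stepping
        $good = $bad;
        $bad += $step;
    }
    while ($bad - $good > 1) {              # binary chop between the two
        my $mid = $good + int(($bad - $good) / 2);
        if ($readable_at->($mid)) { $good = $mid } else { $bad = $mid }
    }
    close $fh;
    return $good + 1;   # last readable offset is size - 1
}

# Demonstration on a small file with a small step.
my ($tmp, $name) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} 'x' x 5000;
close $tmp or die "close failed: $!";
print probe_file_size($name, 1024), "\n";
```

On a 10 GB file this needs about ten coarse steps followed by about thirty probes, which supports the point that a system call is simpler.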

    All in all, system is almost certainly preferable.

      Thank you, BrowserUk for your help.

      POSIX's lseek does indeed suffer from the same flaw and returns a negative integer.

      I guess you're right: unless some other monk comes up with a miraculous solution, I'll stick with a call to the VMS system.

      Thank you for your help.

      Update: fixed a typo.

Re: Integer overflow in -s or file stat results
by Anonymous Monk on Mar 07, 2017 at 14:57 UTC
      The documentation says syscall is not implemented on VMS, but you might try calling syscall() with no args just to see what happens. Other than that, I think you'd need to use XS.
        Thank you for your answer.

        A one-liner test confirms that syscall is not implemented:

        The syscall function is unimplemented at -e line 1.
        I will not try to write XS code under VMS; my knowledge of the VMS internals is much too poor. I guess I'll have to live with the workaround described earlier.

        Thank you anyway for your effort.