in reply to get the size of a LARGE file

I need to figure out the size of large files (several GB). A simple -s results in an overflow, as you can see in my logged session. Is my perl broken?

This may be a Perl problem, or it may be an OS large-file rough edge. Unless the underlying stat provides a 64-bit size, the best you're going to be able to do is turn $blksize and $blocks into instances of Math::BigInt and multiply the two to get the approximate file size.
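A sketch of that approach (assuming the usual stat field positions, $blksize at index 11 and $blocks at index 12; $file stands in for your real path, and see the follow-ups below about what those fields actually mean):

    use Math::BigInt;
    my @st = stat $file or die "stat failed: $!";
    # blocks * blksize, done in arbitrary precision to dodge the overflow
    my $approx = Math::BigInt->new($st[12])->bmul($st[11]);
    print "approximate size: $approx bytes\n";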

(On second thought: if the file can get that large, the OS stat should provide a 64-bit interface. You probably need to update your Perl, or rebuild it with large-file support.)
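One quick way to check how your perl was built (these Config entries exist on 5.6 and later; lseeksize is the size of the offset type in bytes, so 8 means 64-bit offsets):

    % perl -MConfig -le 'print "uselargefiles=$Config{uselargefiles} lseeksize=$Config{lseeksize}"'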

Re: Re: get the size of a LARGE file
by Anonymous Monk on Apr 25, 2002 at 08:13 UTC

    The underlying stat structure has a 64-bit st_size. But the idea with $blksize and $blocks works even without Math::BigInt. I may even be able to recover the exact size by taking the modulus of the broken negative size.
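    Something like this might do it (a sketch only: it assumes the size field wrapped in 32-bit arithmetic, that the block count in field 12 is in 512-byte units, which is common but system-dependent, and that the file isn't sparse; $file stands in for the real path):

        my @st = stat $file or die "stat: $!";
        my $wrapped = $st[7];
        $wrapped += 2**32 if $wrapped < 0;     # undo the signed 32-bit wrap
        my $approx = $st[12] * 512;            # rough size from the block count
        # the true size is congruent to $wrapped mod 2**32; use the block-count
        # approximation to pick the right multiple of 2**32
        my $size = $wrapped + 2**32 * int(($approx - $wrapped + 2**31) / 2**32);
        print "size: $size bytes\n";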

    Thank you!

    --Björn

Re: Re: get the size of a LARGE file
by Anonymous Monk on Apr 25, 2002 at 08:24 UTC

    Just one thing I forgot: st_blksize is not the size of one block, but the preferred I/O block size for the file system.

    So you have to get the size of one block somewhere else.
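    On many Unix systems the block count in field 12 of stat is in 512-byte units, though POSIX leaves this implementation-defined, so something like this is a sketch rather than a portable guarantee:

        my @st = stat $file or die "stat: $!";
        my $approx = $st[12] * 512;   # st_blocks is conventionally 512-byte units
        print "allocated: $approx bytes (approximately the file size)\n";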

    --Björn

      Hi,
      besides the fact that st_blksize is not the block size but only the system's preferred block size, there is a much more upsetting problem. Multiplying the number of blocks by the size of one block does not give you the size of the file. See the following example, where the file a contains just 4 characters.

      @a = stat "a";
      print "Real size = $a[7]\n";
      print "blocksize * blocknum = ", $a[11]*$a[12], "\n";

      prints

      Real size = 4
      blocksize * blocknum = 32768
      and everyone will agree that a file size function which returns 32768 instead of 4 is quite wrong.

      Of course, one may object that 32768 is the space that the file is taking up on your HD, but this is another story, as Kipling would have said.

      Cheers
      Leo TheHobbit
        Multiplying the number of blocks by the size of one block does not give you the size of the file.

        When the filesize is up in the gigabytes, the bytes lost by multiplying file blocks by disk blocksize are down in the noise. Given the uses that file sizes of this magnitude are put to (e.g., how much tape is needed to back this file up; how long will it take to move this file from A to B, and how much disk space will it need when it gets there), losing a few bytes doesn't matter.
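        To put a number on it: for a hypothetical 10 GB file, a worst-case rounding error of 32768 bytes (the overshoot in the example above) comes to about three ten-thousandths of a percent:

            % perl -le 'printf "%.4f%%\n", 100 * 32768 / 10e9'
            0.0003%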