Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

I am looking for a module that will return the size of a file on a Win32 system. I would have thought finding such a module would be easy enough, but I must be looking in the wrong places in Super Search, CPAN, etc. Your help is much appreciated!

Many Thanks,
Martin

Re: filesize module on win32 systems
by stvn (Monsignor) on Mar 01, 2004 at 18:05 UTC
    $file_size = -s "File.txt";

    No module necessary.
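
    A minimal sketch of how the one-liner above might be wrapped for use in a script. The helper name file_size is hypothetical, not part of the original answer. It relies on -s returning undef when the file cannot be stat'ed and a false value when the file is empty, so the two cases are told apart explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption for illustration):
# returns the size in bytes, or undef if the file cannot be stat'ed.
sub file_size {
    my ($path) = @_;
    my $size = -s $path;
    return undef unless defined $size;   # stat failed, e.g. no such file
    return $size || 0;                   # -s yields a false value for an empty file
}
```

    A caller could then write something like: my $n = file_size("File.txt"); and check defined($n) before using it.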

    -stvn
      There are caveats, of course:

      • NTFS supports transparent compression of files. There is the "actual" file size, and the "compressed" size. Which does -s return?
      • NTFS supports sparse files. If a large section of a file contains NULL, the filesystem can save space by not allocating room for the NULL data. Here, I would assume -s would return the full size, not the "on disk" size.
      • NTFS supports "alternate streams". A file can have many alternate streams, taking up data that I assume would never be reported by -s.
      • Depending on the file size & cluster size, most files probably take up more space "on disk" than the contents of the file would suggest. Which does -s report here?

      This is probably not an exhaustive list; I'm sure I'm forgetting something.


        • I'll make an educated guess about compressed files later.
        • Most every modern filesystem supports sparse files, and the call underlying -s reports the virtual, not on-disk, size, on most of them. I assume NTFS is no different. Programs have to specifically be "sparse-aware" to deal with such files properly.
        • The size of alternative streams is not reported by the call underlying -s. Again, programs need to be "alternate-stream-aware" to deal with this situation correctly.
        • It reports the byte size of the file, not the cluster size. Unless you are writing a lowlevel filesystem manipulation/report tool, you shouldn't concern yourself with this distinction in the first place.
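
        To make the byte size versus cluster size distinction concrete, here is a rough sketch of the arithmetic. The helper name allocated_size and the 4096-byte cluster size are assumptions chosen for illustration, and the calculation ignores compression, sparse regions, and any other filesystem-specific storage tricks.

```perl
use strict;
use warnings;

# Hypothetical helper: bytes a file of $size bytes would occupy when space
# is handed out in whole clusters of $cluster bytes each.
sub allocated_size {
    my ($size, $cluster) = @_;
    my $clusters = int(($size + $cluster - 1) / $cluster);   # round up
    return $clusters * $cluster;
}
```

        With an assumed 4096-byte cluster, a 1-byte file and a 4096-byte file would both occupy 4096 bytes, while -s would report 1 and 4096 respectively.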

        You'll notice that the defaults are such that a program naively copying the contents of one file to another, without taking the existence of advanced filesystem features into account, will still work (and that's at the OS level, not the Perl -s level). In light of that trend, my educated guess is that -s reports the uncompressed size of NTFS compressed files.

        And that's probably all the OP needed, too.

        Makeshifts last the longest.

        Perl calls the C runtime fstat() to get the information. In the case of MSVC the actual call is _fstati64(), which in turn calls GetFileInformationByHandle(). That returns a structure called BY_HANDLE_FILE_INFORMATION, which contains two DWORD fields that together hold the file size. This is the file size as it would be if you read the whole thing into memory, i.e. decompressed, de-sparsed, etc.
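
        As a small illustration of how two 32-bit fields make up one size: in BY_HANDLE_FILE_INFORMATION they are named nFileSizeHigh and nFileSizeLow. The helper name combine_dwords below is hypothetical, and this is only a sketch of the arithmetic, not what the C runtime literally does.

```perl
use strict;
use warnings;

# Hypothetical helper: combine the high and low 32-bit halves into one number.
# On a perl without 64-bit integers the result is a floating-point value,
# which stays exact for sizes below 2**53 bytes.
sub combine_dwords {
    my ($high, $low) = @_;
    return $high * 4294967296 + $low;   # 4294967296 == 2**32
}
```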

        To get the actual on-disk size you would need to call GetCompressedFileSize().


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
Re: filesize module on win32 systems
by revdiablo (Prior) on Mar 01, 2004 at 20:05 UTC

    As stvn wrote, you can use perl's builtin -s function to get a file's size. If you want to check out all the other dash functions perl has, you can run perldoc -f -X at a command prompt.

    One thing to keep in mind if you're doing multiple tests on one file: you can pass _ as the argument, and it will use the cached stat info from the previous test, avoiding yet another stat on the same file.
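
    A short sketch of the _ trick, using a hypothetical helper named describe (the name and the returned strings are made up for illustration). Only the first test does a stat; the later ones reuse its result.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a path using a single stat call.
sub describe {
    my ($path) = @_;
    return "missing"   unless -e $path;   # this test performs the stat
    return "directory" if -d _;           # reuses the cached stat info
    return "file of " . ((-s _) || 0) . " bytes";
}
```

    Note that _ refers to whatever was stat'ed last, so the tests that use it should follow the named test directly.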

Re: filesize module on win32 systems
by Aristotle (Chancellor) on Mar 01, 2004 at 20:20 UTC
Re: filesize module on win32 systems
by bageler (Hermit) on Mar 02, 2004 at 23:45 UTC
    I found/modified this chunk of code when I wanted to find the total size of files matching a pattern in a directory, i.e. the total size of *.pm in ~/. I first used it on win32 in a perlTk program so I know it works well.
    package sizes;
    require 5.001;
    use strict;
    use vars qw($VERSION);
    $VERSION = "1.0";

    # new($dir, $pat): open $dir and remember the pattern file names must match
    sub new {
        my ($pkg, $dir, $pat) = @_;
        my $self = bless { dir => $dir, pat => $pat, dh => undef }, $pkg;
        # 'or' instead of '||', so the warning fires when opendir itself fails
        opendir($self->{dh}, $self->{dir})
            or warn "cannot open dir: [".$self->{dir}."] $!\n";
        return $self;
    }

    # check(): return the total size in bytes of all matching entries
    sub check {
        my $self = shift;
        $self->{size} = 0;
        while (defined(my $file = readdir($self->{dh}))) {
            next unless $file =~ /$self->{pat}/;
            my $fqpn = $self->{dir}.'/'.$file;
            $self->{size} += (-s $fqpn) || 0;   # -s is false for empty files
            print STDERR "size $fqpn: ".$self->{size}."\n";
        }
        print STDERR "rewinding\n";
        rewinddir($self->{dh});   # so check() can be called again
        return $self->{size};
    }

    sub DESTROY {
        my $self = shift;
        closedir $self->{dh} if $self->{dh};
    }

    1;

    # In the calling script:
    use sizes;
    $checker = sizes->new($dir,$pat);
    $size=$checker->check();