Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am attempting to create a no-frills way to get the size of all data within a given directory. I want totals one level below the given directory. For example, say I have the following directory tree (on Windows) under c:\temp:
c:\temp
 -dir1
  --some_file1
  --some_file2
 -dir2
  --some_file1
  --some_file2
  --DIR_A
   ---some_file1
   ---some_file2
   ---DIR_B
    ----some_file

For the above, I am looking for output similar to:
dir1 = {size of dir1 and everything under it including contents of all subdirectories}
dir2 = {size of dir2 and everything under it including contents of all subdirectories}

I don't need totals for DIR_A and DIR_B as their totals should already be included in the total for dir2.

Here is the code I have so far:
use File::Find;

my $dir = $ARGV[0];
find(\&do_dir, $dir);
my %hash;

sub do_dir {
    my $size = -s ($File::Find::name);
    my $dir = ($File::Find::dir);
    $hash{($File::Find::dir)} += -s ($File::Find::name);
}

foreach (keys %hash) {
    print "DIR $_ = $hash{$_}\n";
}

As you can see, I don't know how to limit it to totaling just one directory down.

Replies are listed 'Best First'.
Re: Directory size 1 level deep
by ishnid (Monk) on Apr 20, 2004 at 15:57 UTC
    Yet another way. As with pelagic's, you can set the level to stop at (it defaults to 1, as requested). Personally, I've always preferred File::Find::Rule to File::Find.
    #!/usr/bin/perl -w
    use strict;
    use File::Find::Rule;

    my ($dir, $show_level) = @ARGV;
    $show_level ||= 1;

    for my $current (File::Find::Rule->directory
                                     ->not_name(qr/^\.+$/)
                                     ->maxdepth($show_level)
                                     ->in($dir)) {
        my $size = 0;
        $size += $_ for map -s, File::Find::Rule->file->in($current);
        print "DIR $current = $size\n";
    }
Re: Directory size 1 level deep
by pelagic (Priest) on Apr 20, 2004 at 14:25 UTC
    Takes 2 parms: dir_init and show_level
    #!/usr/bin/perl
    use strict;

    my $debug = 0;
    my ($dir_init, $show_level) = @ARGV;
    my (@ls, @dirs, $size, $parent);
    my %dir_size      = ();
    my %parent_of_dir = ();
    my %level_of_dir  = ();
    my $sep   = "~" x 80;
    my $form  = "%20s %s\n";
    my $files = 0;

    # Get the list of files using readdir.
    push(@dirs, $dir_init);
    $dir_size{$dir_init}     = 0;
    $level_of_dir{$dir_init} = 0;

    foreach my $directory (@dirs) {
        print "\n\n------- starting directory scan -------- in $directory" if ($debug);
        opendir (DIR, $directory) or die ("Can't open dir: '$directory'.\nReason: $!");
        @ls = readdir(DIR);
        closedir (DIR);

        # Then go through the results of ls and work out the files..
        FILE: foreach my $file (@ls) {
            # print "\n* * * * * $file" if ($debug);
            next if ($file =~ m/^\.\.?$/);
            $files++;
            SWITCH: {
                if (-d "$directory/$file") {
                    print "\n$file\tis DIRECTORY" if ($debug);
                    push(@dirs, "$directory/$file");
                    $level_of_dir{"$directory/$file"}  = $level_of_dir{$directory} + 1;
                    $parent_of_dir{"$directory/$file"} = $directory;
                    $dir_size{"$directory/$file"}      = 0;
                    last SWITCH;
                }
                if (-l "$directory/$file") {
                    print "\n$file is a symbolic link" if ($debug);
                    last SWITCH;
                }
                if (-f "$directory/$file") {
                    print "\n$file is a plain file" if ($debug);
                    # ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks)= stat "$directory/$file";
                    $size = (-s "$directory/$file");
                    print "\tsize: $size" if ($debug);
                    $dir_size{"$directory"} += $size;
                    $parent = $parent_of_dir{$directory};
                    # print "\ntry with parent as: $parent" if ($debug);
                    while ($parent) {
                        $dir_size{$parent} += $size;
                        $parent = $parent_of_dir{$parent};
                        # print "\ntry with parent as: $parent" if ($debug);
                    }
                    last SWITCH;
                }
                else {
                    print "\n$file\tis of UNKNOWN type";
                }
            }
        }
    }

    print "\n------- finished directory scans --------" if ($debug);
    print "Total number of files/directories examined: $files\n";
    print $sep . "\n";
    foreach my $k (sort keys %dir_size) {
        next if ($show_level && ($show_level < $level_of_dir{$k}));
        print "\t\t" x $level_of_dir{$k};
        printf $form, commify($dir_size{$k}), $k;
        # printf $form, $dir_size{$k}, $k;
    }
    print $sep . "\n";
    print "\n" if ($debug);
    foreach my $k (keys %parent_of_dir) {
        print "\n $k: $parent_of_dir{$k}" if ($debug);
    }

    sub commify {
        my $text = reverse $_[0];
        $text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
        return scalar reverse $text;
    }

    __OUTPUT__
    Total number of files/directories examined: 1489
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
              58,192,935 P:\dev\perl
                                 145,431 P:\dev\perl/0735712891Code
                               1,129,217 P:\dev\perl/Compress-Zlib-1.33
                              34,215,542 P:\dev\perl/DPL
                                 507,956 P:\dev\perl/Regexp-Common-2.113
                                  71,412 P:\dev\perl/albums
                                 508,041 P:\dev\perl/calendar
                                  65,996 P:\dev\perl/formalware
                                  53,356 P:\dev\perl/guestbook
                                   1,714 P:\dev\perl/hoch
                                  12,139 P:\dev\perl/http
                              11,692,406 P:\dev\perl/modules
                                 100,426 P:\dev\perl/oracle
                                  31,036 P:\dev\perl/perl-xml-quickstart
                                 758,171 P:\dev\perl/updf
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    pelagic
Re: Directory size 1 level deep
by sgifford (Prior) on Apr 20, 2004 at 15:52 UTC
    You need a way to transform the directory the file is in to the appropriate top-level directory. Use the File::Basename module, then something like:
    sub topdir {
        my($nd,$parent)=@_;
        my $d = $nd;
        # Hack; strip off trailing characters that might be directory
        # separators
        $parent =~ s/\W$//;
        while ($nd ne $parent) {
            $d = $nd;
            $nd = File::Basename::dirname($d);
        }
        return $d;
    }
    That HACK line is in case the argument is c:\temp\ instead of just c:\temp. It should really be $dir=join("",fileparse($dir));, but that doesn't work as documented on my system.
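    A possible way around the trailing-separator hack, offered as a sketch rather than a drop-in replacement: File::Spec->canonpath drops a trailing separator, and comparing path components avoids looping with dirname. The helper name topdir_by_components is made up for illustration, and the sketch assumes both paths are written in the same style (both relative or both absolute).

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not from the original post): return the directory
# directly below $parent that contains $path, or $parent itself when the
# two are the same. Returns nothing if $path is not under $parent.
sub topdir_by_components {
    my ($path, $parent) = @_;

    # canonpath removes a trailing separator, so "c:\temp\" and "c:\temp"
    # split into the same list of components.
    my @path   = File::Spec->splitdir(File::Spec->canonpath($path));
    my @parent = File::Spec->splitdir(File::Spec->canonpath($parent));

    return if @path < @parent;
    for my $i (0 .. $#parent) {
        return if $path[$i] ne $parent[$i];
    }

    return File::Spec->catdir(@parent) if @path == @parent;
    return File::Spec->catdir(@parent, $path[scalar @parent]);
}

# Example call with a trailing separator on the parent.
print topdir_by_components('/tmp/top/dir2/DIR_A/DIR_B', '/tmp/top/'), "\n";
```

    Unlike the dirname loop, this version cannot spin forever when the file's directory is not actually below the parent; it simply returns nothing in that case.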

    The code to use this is:

    #!/usr/bin/perl -w
    use strict;
    use File::Find;
    use File::Basename;

    our $dir = $ARGV[0];
    find(\&do_dir, $dir);
    my %hash;

    sub do_dir {
        my $size = -s ($File::Find::name) or return;
        my $d = topdir($File::Find::dir,$dir);
        if (!defined($hash{$d})) {
            $hash{$d} = 0;
        }
        # warn "size=$size, d=$d\n";
        $hash{$d} += $size;
    }

    foreach (keys %hash) {
        print "DIR $_ = $hash{$_}\n";
    }
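    The same idea can also be sketched end to end on a throwaway tree. This is not sgifford's code: sizes_one_level is a hypothetical name, the file sizes are invented for the demo, and the sketch makes two choices of its own. It passes no_chdir so that $File::Find::name stays valid even when the starting directory is relative, and it counts plain files only.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical function (not from the thread): total the sizes of plain
# files under each immediate subdirectory of $top. Files sitting directly
# in $top are collected under the key '.'.
sub sizes_one_level {
    my ($top) = @_;
    my %total;
    find({
        no_chdir => 1,    # keep the cwd, so $File::Find::name is usable as-is
        wanted   => sub {
            return unless -f $File::Find::name;
            # Path of the file's directory relative to $top, e.g. "dir2/DIR_A".
            my $rel = File::Spec->abs2rel($File::Find::dir, $top);
            my ($first) = File::Spec->splitdir($rel);
            $total{$first} += -s _;    # reuse the stat done by -f
        },
    }, $top);
    return \%total;
}

# Build a temporary tree shaped like the one in the question
# (the byte counts are made up).
my $top = tempdir(CLEANUP => 1);
make_path("$top/dir1", "$top/dir2/DIR_A/DIR_B");
my %bytes = (
    'dir1/some_file1'            => 10,
    'dir1/some_file2'            => 20,
    'dir2/some_file1'            => 1,
    'dir2/some_file2'            => 2,
    'dir2/DIR_A/some_file1'      => 3,
    'dir2/DIR_A/some_file2'      => 4,
    'dir2/DIR_A/DIR_B/some_file' => 5,
);
for my $name (sort keys %bytes) {
    open my $fh, '>', "$top/$name" or die "Can't write $top/$name: $!";
    binmode $fh;
    print {$fh} 'x' x $bytes{$name};
    close $fh or die "Can't close $top/$name: $!";
}

my $total = sizes_one_level($top);
print "DIR $_ = $total->{$_}\n" for sort keys %$total;
```

    With the sizes chosen above this prints "DIR dir1 = 30" and "DIR dir2 = 15"; DIR_A and DIR_B are folded into dir2 and get no lines of their own.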
Re: Directory size 1 level deep
by EdwardG (Vicar) on Apr 20, 2004 at 19:36 UTC

    Least frills?

    d:\>du -h sqldata
    743M    sqldata/MSSQL/Data
    65K     sqldata/MSSQL/LOG
    0       sqldata/MSSQL/JOBS
    0       sqldata/MSSQL/BACKUP
    0       sqldata/MSSQL/REPLDATA/FTP
    0       sqldata/MSSQL/REPLDATA
    743M    sqldata/MSSQL
    743M    sqldata

    This gives you more than you ask for, but it is not much of a stretch to grep however you want.

    You can get a Win32 version of du from this GNU project on SourceForge.
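    As a rough illustration of the "grep however you want" step, here is a small Perl filter that keeps only the lines a given number of levels below the starting directory. The name du_at_depth is hypothetical, and the sketch assumes du's usual "size, whitespace, path" layout with '/' separators, as in the listing above. GNU du also has a --max-depth option that may make the filter unnecessary, if your build supports it.

```perl
use strict;
use warnings;

# Hypothetical filter (not from the thread): keep the du lines whose path
# is exactly $depth levels below $root.
sub du_at_depth {
    my ($root, $depth, @lines) = @_;
    my @keep;
    for my $line (@lines) {
        # Split "size<whitespace>path"; skip lines that do not match.
        my ($size, $path) = $line =~ /^(\S+)\s+(.+?)\s*$/ or next;
        next unless index($path, "$root/") == 0;
        my $rest   = substr $path, length("$root/");
        my $levels = () = $rest =~ m{[^/]+}g;    # count path components
        push @keep, "$size\t$path" if $levels == $depth;
    }
    return @keep;
}

# Sample input copied from the du listing above.
my @sample = (
    "743M\tsqldata/MSSQL/Data",
    "65K\tsqldata/MSSQL/LOG",
    "0\tsqldata/MSSQL/JOBS",
    "0\tsqldata/MSSQL/BACKUP",
    "0\tsqldata/MSSQL/REPLDATA/FTP",
    "0\tsqldata/MSSQL/REPLDATA",
    "743M\tsqldata/MSSQL",
    "743M\tsqldata",
);
print "$_\n" for du_at_depth('sqldata', 1, @sample);
```

    For the sample listing, a depth of 1 leaves only the sqldata/MSSQL line; a depth of 2 would keep Data, LOG, JOBS, BACKUP and REPLDATA.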