http://qs1969.pair.com?node_id=24294


in reply to RE: What's eating all your disk space?
in thread What's eating all your disk space?

I like this script a lot; very handy. It was taking too long on some of my larger directory trees, though, so I took the liberty of speeding it up. The following does the sorting in Perl, and also calculates the sum internally to eliminate the '.' from the du call. This saves du from having to walk the directory tree twice (once for '.' and once for the individual '*' arguments) and sped things up a lot for me.
#! /usr/bin/env perl
open(DU, "du -sk *|") || die "Can't exec du: $!\n";
while (<DU>) {
    ($size, $inode) = split;
    chop($size);
    $sum += $size;
    push @entries, { size => $size, inode => $inode };
}
close(DU);

@entries = sort { $b->{size} <=> $a->{size} } @entries;

foreach $e (@entries[0 .. 10]) {
    printf("%30s | %5d | %2.2f%%\n", $e->{inode}, $e->{size}, $e->{size}/$sum*1000);
}
Thanks for a cool script!

RE: RE: RE: What's eating all your disk space?
by jjhorner (Hermit) on Jul 25, 2000 at 19:36 UTC

    I believe your script above has issues:

    • Why did you chop $size? You are only taking off the last digit. If you meant to strip the metric suffix (G, M, or k), you could just use s/G|M|k//o
    • STRICT and WARNINGS!
    • You have a percentage multiplied by 1000. I believe you meant 100.
    • If you want the metric notations (G, M, or k), you will have to do some funny math to get them all into the same unit (kilobytes); see the sketch just below.
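    For what it's worth, a minimal sketch of that "funny math", assuming 'du -sh'-style output where sizes carry a suffix; the %mult table and the suffix regex here are illustrative, since real du output varies by platform (GNU uses an uppercase K, for instance):
    my %mult = (k => 1, K => 1, M => 1024, G => 1024 * 1024);
    open(DU, "du -sh *|") || die "Can't exec du: $!\n";
    while (<DU>) {
        my ($size, $name) = split;
        # strip the metric suffix and normalize everything to kilobytes
        $size *= $mult{$1} if $size =~ s/([KkMG])$//;
        printf("%30s | %10d KB\n", $name, $size);
    }
    close(DU);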
    J. J. Horner
    Linux, Perl, Apache, Stronghold, Unix
    jhorner@knoxlug.org http://www.knoxlug.org/
    
      To address each point:
      • That's a goof left over from development. Oops. Fixing it takes care of the 3rd point you make as well.
      • I agree that using -w and 'use strict;' are excellent things to do. However, I intended this to be used on the command line, either as an alias or as a shell script somewhere in your $PATH, and I want it to be faster rather than slower. Granted, the differences are small, I admit. Blindly following the dogma of "must use strict and -w all the time" is dangerous, just as blindly following any dogma is. That said, I use strict; and -w almost always.
      • Noted.
      • I've already taken care of this, or did you not see the -k flag I passed to 'du'? It forces du to print its numbers in 1 KB blocks instead of the system default block size (which is not always 1024 bytes). Since the numbers are in KB, I don't have to worry about stripping any metric suffixes, and the 'funny math' becomes trivial. Not all versions of 'du' support Linux's -h flag, but they do support -k.
      In any case, I'll make a few refinements...
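      (For illustration, here is one way those refinements might look, folding in the fixes raised above: the stray chop dropped, 100 instead of 1000, and strict/warnings enabled. This is a sketch only, not the revision that was actually posted.)
      #!/usr/bin/env perl
      use strict;
      use warnings;

      open(DU, "du -sk *|") || die "Can't exec du: $!\n";
      my @entries;
      my $sum = 0;
      while (<DU>) {
          my ($size, $name) = split;
          $sum += $size;
          push @entries, { size => $size, name => $name };
      }
      close(DU);

      @entries = sort { $b->{size} <=> $a->{size} } @entries;

      # grep out undef slots in case there are fewer than 11 entries
      foreach my $e (grep { defined } @entries[0 .. 10]) {
          printf("%30s | %7d | %5.2f%%\n", $e->{name}, $e->{size}, $e->{size} / $sum * 100);
      }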
RE: RE: RE: What's eating all your disk space?
by fundflow (Chaplain) on Jul 25, 2000 at 19:45 UTC
    I think that you meant $sum*100 in the end...
    :)

    Here is a small improvement. It adds a '/' to directory names:
    . . .
    ($size, $inode) = split;
    $inode .= "/" if (-d $inode);
    chop($size);
    . . .
    also, the printf should be changed. Instead of "%2.2f" you probably meant "%5.2f". The first number is the minimum total field width in characters, including the period, not a count of digits.
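    (A quick demonstration of the difference, for illustration:)
    printf("[%2.2f]\n", 3.14159);   # prints [3.14]  - width 2 is less than the output needs, so it is ignored
    printf("[%5.2f]\n", 3.14159);   # prints [ 3.14] - padded to 5 characters total, period included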
RE: RE: RE: What's eating all your disk space?
by hawson (Monk) on Jul 27, 2000 at 06:48 UTC
    I thought about building in the sorting but decided against it ("One tool does one thing"). I figure I'll leave sorting to 'sort'. ;-)

    As to 'du' walking the tree twice, I looked at that as well. In my tests, it looked like the results were cached somewhere, and thus 'du -sk . *' is quite fast. A prior version of the script did something horrible along the lines of du -sk * | perl -e '$sum = `du -sk .`; while (<>) { ... }', so this is an improvement already.

    This is pretty quick, and I use it on 40 GB RAID arrays. :-)