TASdvlper has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I was wondering if there is a quick Perl script or a command-line Perl utility to walk all files, from the directory you are in down through all child directories, and print every file found from largest to smallest (showing which directory each is in).

Basically, my goal here is to watch the capacity of my O/S partition, and when it starts to get full, I would like to flag which files are the largest and delete/archive them to free up space. There are hundreds of directories to potentially go through, so doing it by hand is not efficient.

Take care and thanks all.

  • Comment on Parsing current and sub-directories and prints out all files found from largest size to smallest

Replies are listed 'Best First'.
Re: Parsing current and sub-directories and prints out all files found from largest size to smallest
by tachyon (Chancellor) on Apr 13, 2004 at 01:31 UTC
    #!/usr/bin/perl
    use File::Find;

    my $dir = $ARGV[0] || '.';
    find( sub { -f and $h{$File::Find::name} = -s }, $dir );
    print "$h{$_}\t$_\n" for sort { $h{$b} <=> $h{$a} } keys %h;

    For a slightly prettier version....

Re: Parsing current and sub-directories and prints out all files found from largest size to smallest
by kvale (Monsignor) on Apr 13, 2004 at 00:43 UTC
    I can't point to any scripts like this offhand.

    If I were writing this script, I'd use File::Find to traverse all the directories, -s $file to get the file size, store the results to a hash, and sort by size after the traversal is done.
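    The steps described above can be sketched as follows (a minimal illustration, run from the current directory; the variable names are mine, not from any existing script):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

my %size;    # full path => size in bytes

# Traverse the current directory and every child directory,
# recording the size of each plain file found.
find( sub { $size{$File::Find::name} = -s $_ if -f $_ }, '.' );

# Sort by size, largest first, and print size and full path.
for my $file ( sort { $size{$b} <=> $size{$a} } keys %size ) {
    printf "%12d  %s\n", $size{$file}, $file;
}
```

    Since `find` chdirs into each directory by default, `$_` holds the bare filename while `$File::Find::name` holds the path relative to the starting directory.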

    -Mark

Re: Parsing current and sub-directories and prints out all files found from largest size to smallest
by tachyon (Chancellor) on Apr 13, 2004 at 01:54 UTC
    ls -l `find ./* -type f` | sort -nrk 5 | head -10
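    Note that the backtick form can hit the shell's argument-length limit on a large tree, and `sort -nrk 5` mis-sorts when filenames contain spaces. If GNU find is available (its `-printf` option is an assumption about the local toolchain), this variant avoids both problems:

```shell
# Print "size<TAB>path" for every plain file, sort numerically
# largest-first, and show the top ten.
find . -type f -printf '%s\t%p\n' | sort -nr | head -10
```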

    cheers

    tachyon

Re: Parsing current and sub-directories and prints out all files found from largest size to smallest
by graff (Chancellor) on Apr 13, 2004 at 01:38 UTC
    If the disk in question is being backed up on a regular basis, and if the backup system involved creates complete logs or maintains a database, it would be neatest if you could make use of that sort of resource, since it's there anyway, and is complete, accurate and up-to-date.

    If there are no backups (ooooh, living on the edge, are we?), or if you don't have easy access to backup logs or a database, look at the unix "find" utility -- it will do everything you want, once you learn the command line usage. (There is a windows port of the tool, if that's what you need.) You could write the equivalent tool in Perl, using the File::Find module, but it'll be more work to set up, it'll run slower, and it'll consume more system resources while it's running.

    (Update: tachyon has just disproved the part about it being more work to set up -- or at least, the point is moot, since he's done the work; but it's still true that File::Find requires more run-time and memory than doing the equivalent job with "find".)

    For a relevant discussion of using "find" with Perl (which can be easy, fast and effective), check out this snippet (shameless plug): An alternative to File::Find
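    One common shape for that hybrid approach is to let the external "find" do the traversal and have Perl collect and sort the results; a rough sketch, assuming a Unix find on the PATH (this is my illustration, not the code from the linked snippet):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Let the external "find" walk the tree; Perl just collects sizes and sorts.
open my $find, '-|', 'find', '.', '-type', 'f'
    or die "can't run find: $!";

my %size;
while ( my $path = <$find> ) {
    chomp $path;
    my $s = -s $path;
    $size{$path} = $s if defined $s;    # file may vanish mid-run
}
close $find;

print "$size{$_}\t$_\n" for sort { $size{$b} <=> $size{$a} } keys %size;
```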

Re: Parsing current and sub-directories and prints out all files found from largest size to smallest
by pbeckingham (Parson) on Apr 13, 2004 at 00:02 UTC

    Given that you have posted no code, I suggest you look at the opendir, readdir, and closedir functions, paying attention to the return values from readdir, and take a stab at it. Give it a try - make us proud.
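    For anyone who takes that stab, a hand-rolled recursive version using those functions might look like this (a sketch under the same assumptions as above, not a polished tool):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %size;    # full path => size in bytes

sub scan {
    my ($dir) = @_;
    opendir my $dh, $dir or do { warn "can't open $dir: $!"; return };
    for my $entry ( readdir $dh ) {
        next if $entry eq '.' or $entry eq '..';
        my $path = "$dir/$entry";
        if    ( -d $path ) { scan($path) }               # recurse into subdirs
        elsif ( -f $path ) { $size{$path} = -s $path }   # record plain files
    }
    closedir $dh;
}

scan('.');
print "$size{$_}\t$_\n" for sort { $size{$b} <=> $size{$a} } keys %size;
```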