
Maybe the first step, limiting the search to only those directories in /home that are owned by a uid >= 500, should just be done with readdir. Once you have the list of home directories to be searched, use File::Find on each of those in turn.
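Here is a rough sketch of that two-step idea, in case it helps to compare it with the du version further down. The helper names (home_dirs, tree_size_kb) and the command-line argument are made up for illustration. Also note that summing lstat sizes gives apparent file sizes, so the numbers will not match du's block-based totals exactly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Step 1: plain readdir, keeping real directories owned by uid >= $min_uid.
# (home_dirs is a hypothetical helper name, not from the original thread.)
sub home_dirs {
    my ( $base, $min_uid ) = @_;
    opendir( my $dh, $base ) or die "opendir $base: $!";
    my @dirs;
    for my $name ( sort readdir $dh ) {
        next if $name eq '.' or $name eq '..';
        my @st = lstat "$base/$name" or next;    # lstat: do not follow symlinks
        next unless -d _;                        # "_" reuses the lstat result
        push @dirs, "$base/$name" if $st[4] >= $min_uid;
    }
    closedir $dh;
    return @dirs;
}

# Step 2: File::Find on a single tree, summing the sizes of plain files.
# This is apparent size in KB, an assumption that differs from du's block count.
sub tree_size_kb {
    my ($dir) = @_;
    my $bytes = 0;
    find(
        sub {
            my @st = lstat $_ or return;
            $bytes += $st[7] if -f _;
        },
        $dir
    );
    return int( $bytes / 1024 );
}

# Hypothetical usage: perl scan.pl /home
if (@ARGV) {
    my %usage = map { ( $_ => tree_size_kb($_) ) } home_dirs( $ARGV[0], 500 );
    print "$_ : $usage{$_}\n" for sort { $usage{$b} <=> $usage{$a} } keys %usage;
}
```

Because the uid filter runs before File::Find is ever called, the skipped directories are never traversed at all, which is the point of splitting the job in two.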

Or (to beat a favorite dead hobby horse of mine) use other tools to do something simple, like "du -k -s $dir" -- I'll bet this turns out to run faster and use less memory than File::Find.

I tried out the following, and I think it basically does what you're looking for. I didn't benchmark it against using File::Find to get the equivalent result for this case, but in other cases where I have done the benchmarking, File::Find consistently takes at least a few times longer than a solution that doesn't use it.

#!/usr/bin/perl

use strict;

# round up the usual suspects...

chdir "/home" or die "chdir /home: $!";
opendir( H, "." ) or die $!;
my @homers = grep { ( -d and (stat(_))[4] >= 500 ) } readdir H;
closedir H;

# track down their disk usage

open( SH, "| /bin/sh > /tmp/home.scan.$$" ) or die $!;
print SH "du -k -s $_\n" for ( @homers );
close SH;

# read and print the results from worst to nicest

open( U, "/tmp/home.scan.$$" ) or die $!;
my %usage = map { (/(\d+)\s+(\S+)/); $2=>$1 } <U>;
close U;

print "$_ : $usage{$_}\n" for ( sort { $usage{$b} <=> $usage{$a} } keys %usage );

# all done

unlink "/tmp/home.scan.$$";
exit(0);

(update: in case it's not clear, note that "du" does a recursive tally of space consumed by a given directory tree; by default, it lists all subdirectories and the total data contained within each -- the "-s" option turns off the detail and gives just a bottom-line total for the top-level path. Also, "du" does not follow symbolic links, whether these point to data files or other directories; I gather that was the intention of the OP, but it's not clear to me whether the OP's use of find would follow symlinks that point to directories.)
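Since "du -s" prints just one "size path" line per argument, the temp file and the pipe to /bin/sh could probably be skipped by reading du's output directly. The following is only a sketch of that variation: parse_du_line and du_total_kb are names I made up, and it assumes a du that accepts the usual "--" end-of-options marker and a Perl recent enough (5.8 or later) for the list form of a pipe open.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split one line of "du -k -s" output into (kilobytes, path).
# Unlike a \S+ match, this keeps paths that contain spaces intact.
sub parse_du_line {
    my ($line) = @_;
    return unless defined $line;
    chomp $line;
    return $line =~ /^(\d+)\s+(.+)$/ ? ( $1, $2 ) : ();
}

# Run du on one directory through a list-form pipe open:
# no shell is involved, so odd characters in $dir are not interpreted.
sub du_total_kb {
    my ($dir) = @_;
    open( my $du, '-|', 'du', '-k', '-s', '--', $dir )
        or die "cannot run du: $!";
    my ($kb) = parse_du_line( scalar <$du> );
    close $du;
    return $kb;    # undef if du printed nothing usable
}

# Hypothetical usage: perl usage.pl /home/alice /home/bob
if (@ARGV) {
    my %usage;
    for my $dir (@ARGV) {
        my $kb = du_total_kb($dir);
        $usage{$dir} = $kb if defined $kb;
    }
    print "$_ : $usage{$_}\n"
        for sort { $usage{$b} <=> $usage{$a} } keys %usage;
}
```

This starts one du process per directory instead of feeding them all to a single shell, so it trades a little process overhead for not needing a scratch file in /tmp.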

In reply to Re: File::find and skipping directories by graff
in thread File::find and skipping directories by neilwatson
