Preceptor has asked for the wisdom of the Perl Monks concerning the following question:
So far, my code using 'File::Find' is pretty much doing what I want.
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use threads;

my $size_threshold = 5 * 1024 * 1024;    # point at which to abort directory traversing.
my @dirs = (
    "/nas/fs001", "/nas/fs002", "/nas/fs003", "/nas/fs003",
    "/nas/fs004", "/nas/fs005", "/nas/fs006", "/nas/fs007",
    "/nas/fs008", "/nas/fs009", "/nas/fs010", "/nas/fs011",
    "/nas/fs012", "/nas/fs013", "/nas/fs014"
);
@dirs = ( "/usr/local/apache" );
@dirs = ( "/nas/fs001", "/nas/fs002" );    #"/usr/local/apache";
my $customer_file = "/usr/local/apache/htdocs/dusage/disk_usage.conf";
my $max_depth = 7;
my $debug = 1;

sub dusage {
    my $dir = pop;    # 1 arg only, because that lets me thread.
    my $tsize;
    my %rtree;
    my $datafile = $dir;
    $datafile =~ s,/,,g;
    print "Opening $datafile for output";
    open ( OUTPUT, ">$datafile.csv" ) or die $!;
    find (
        sub {
            if ( -f && ! -l ) {
                my $filesize = -s $_;
                $tsize += $filesize;
                # chop up the path, populate rtree at each of traverse_depth levels
                my @directory_structure = split ( '/', $File::Find::name );
                pop(@directory_structure);    # we'll never want the trailing filename
                for ( my $depth = 0; $depth <= $max_depth; $depth++ ) {
                    if ( $#directory_structure < $depth ) { next };
                    my $thispath = join ( '/', @directory_structure[0..$depth] );
                    $rtree{$thispath} += $filesize;
                }
            }
        },
        $dir
    );
    foreach my $key ( keys ( %rtree ) ) {
        my $indent = ( $key =~ tr,/,, );
        print OUTPUT $indent, ",", $key, ",", $rtree{$key}, "\n";
    }
    close (OUTPUT);
}

#main
foreach my $directory ( @dirs ) {
    dusage ( $directory );
}
I've been looking at doing a 'thready' version, so I can run across these filesystems all at once.
e.g. changing that 'last bit' to:
my %threads;
foreach my $directory ( @dirs ) {
    print "starting $directory search thread\n";
    $threads{$directory} = threads -> new ( \&dusage, $directory );
}
foreach my $directory ( @dirs ) {
    print "waiting for $directory collator thread to join...";
    $threads{$directory} -> join;
    print "done.\n";
}

Now, this just doesn't work. The reason, as far as I can tell, is that File::Find defines itself globally, so the things I do within each thread clobber one another. (The 'threading' works fine if I only use one thread, as far as I can tell.)
Is there an obvious/relatively painless way of doing what I want here? e.g. doing multiple File::Find's at once.
I appreciate I can quite easily just run multiple instances of this program, with different directory lists, and if there's no other solution, I'll try doing that, but I was hoping to be able to do a 'collect, collate, report' within a single bit of code.
Edit: Looks like what I'm looking for is the 'no_chdir' option to File::Find. I've amended the code to use it, and will be re-running it to see how it works out.
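
For anyone following along, here is a rough sketch of what the amended version might look like. The File::Find option is spelled no_chdir. By default File::Find chdir()s into each directory it visits, and the current working directory belongs to the whole process rather than to a single thread, which would plausibly explain the clobbering described above. With no_chdir set, $_ inside the callback holds the full path. The name dusage comes from the code above; scan_all, passing $max_depth as an argument, and returning the totals from join instead of writing a CSV per thread are my own assumptions for illustration, and the threaded branch assumes a perl built with ithreads (it falls back to a sequential scan otherwise). Not tested against a NAS.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;
use File::Find;

# Sum regular-file sizes under $dir, keyed by each leading part of the
# containing directory path, down to $max_depth path components.
# no_chdir => 1 stops File::Find from calling chdir(), so concurrent
# traversals do not fight over the process-wide working directory.
sub dusage {
    my ( $dir, $max_depth ) = @_;
    my %rtree;
    find(
        {
            no_chdir => 1,
            wanted   => sub {
                # With no_chdir, $_ is the full path (same as $File::Find::name).
                return unless -f $_ && !-l $_;
                my $filesize = -s $_;
                my @parts = split m{/}, $File::Find::name;
                pop @parts;    # drop the trailing file name
                for my $depth ( 0 .. $max_depth ) {
                    last if $#parts < $depth;
                    $rtree{ join '/', @parts[ 0 .. $depth ] } += $filesize;
                }
            },
        },
        $dir
    );
    return \%rtree;
}

# Hypothetical driver: one thread per directory, results collected via join.
# Returns a hashref of { directory => { path => bytes } }.
sub scan_all {
    my ( $max_depth, @dirs ) = @_;
    my %result;
    if ( $Config{useithreads} ) {
        require threads;
        my @running;
        for my $dir (@dirs) {
            # Ask for scalar context so join() hands back the hashref.
            my $thr = threads->create( { context => 'scalar' },
                \&dusage, $dir, $max_depth );
            push @running, [ $dir, $thr ];
        }
        for my $pair (@running) {
            my ( $dir, $thr ) = @$pair;
            $result{$dir} = $thr->join;
        }
    }
    else {
        # No ithreads in this perl: scan one directory at a time.
        $result{$_} = dusage( $_, $max_depth ) for @dirs;
    }
    return \%result;
}

# Usage: perl dusage.pl DIR [DIR ...]
if (@ARGV) {
    my $sizes = scan_all( 7, @ARGV );
    for my $dir ( sort keys %$sizes ) {
        for my $path ( sort keys %{ $sizes->{$dir} } ) {
            my $indent = ( $path =~ tr,/,, );
            print join( ',', $indent, $path, $sizes->{$dir}{$path} ), "\n";
        }
    }
}
```

Returning the hash from each thread keeps the 'collect, collate, report' step in the main thread, so no file handles or other globals are shared between workers. Each join copies the thread's result back to the parent, which may be slow for very large trees.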