Using File::Find will likely be faster simply because File::Find chdir()s into each directory as it recurses. That means you end up doing things like stat("file.txt") instead of stat("root/subdir/subsubdir/file.txt"). The latter has to at least parse that path every time, and probably has to traverse each of the named directories every time as well.
Another way to make your code faster is to use the special stat target of _ which lets you get more data about the same file without making Perl call stat over and over.
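As a minimal sketch of the "_" trick (the helper name file_info is made up for illustration, not something from your code): only the first file test actually calls stat, and the later tests read the buffer that call filled in.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (is_dir, size, age_in_days) for a path,
# or an empty list if the path does not exist.
sub file_info {
    my ($path) = @_;
    return unless -e $path;           # the only real stat() call
    my $is_dir = (-d _) ? 1 : 0;      # reuses the cached stat buffer
    my $size   = (-s _) || 0;         # -s gives the size, or a false value if empty
    my $age    = -C _;                # days since inode change, relative to script start
    return ($is_dir, $size, $age);
}
```

Calling file_info("file.txt") costs one stat no matter how many "_" tests follow the first one.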
The trick with File::Find is how to share the variables between related calls to your "wanted" subroutine while not sharing them between unrelated calls to your "wanted" subroutine.
You could do something very similar to what you have above with:

    find( sub {
        filestat( "ignored", $file_count, $dir_count, $total_size,
                  $aged_file_count, $aged_total_size );
    }, $pathname );

and then rip out most of your "pathstat" and rename it "filestat":

    sub filestat {
        if (-d $_) {
            if ($_ ne "." && $_ ne "..") {
                ++$_[2];
            }
        } else {
            ++$_[1];
            $_[3] += -s _;
            my $file_age= (-C _);
            if ($file_age >= $lowrange && $file_age <= $highrange) {
                ++$_[4];
                $_[5] += -s _;
            }
        }
    }

but it is possible to clean that up much more.
If going for maximal speed, I'd probably make that code a bit easier to read and maintain by using symbolic constants instead of literal 1 through 5:

    sub iFileCount()     { 0; }
    sub iDirCount()      { 1; }
    sub iTotalSize()     { 2; }
    sub iAgedFileCount() { 3; }
    sub iAgedTotalSize() { 4; }

    find( sub {
        filestat( $file_count, $dir_count, $total_size,
                  $aged_file_count, $aged_total_size );
    }, $pathname );

    sub filestat {
        my($file_age);
        my($file_size);
        if (-d $_) {
            if ($_ ne "." && $_ ne "..") {
                ++$_[iDirCount];
            }
        } else {
            ++$_[iFileCount];
            $file_size = (-s _);
            $_[iTotalSize] += $file_size;
            $file_age = (-C _);
            if ($file_age >= $lowrange && $file_age <= $highrange) {
                ++$_[iAgedFileCount];
                $_[iAgedTotalSize] += $file_size;
            }
        }
    }
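Another way to share the counters between related calls to "wanted" without sharing them between unrelated ones is a closure. This is only a sketch under assumptions: the wrapper name scan_tree and the hash keys are invented for illustration, and it is not tuned for speed the way the @_ aliasing version is.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical wrapper: every call gets fresh lexical counters, so two
# unrelated scans never see each other's totals.
sub scan_tree {
    my ($root, $low, $high) = @_;
    my %t = (files => 0, dirs => 0, size => 0,
             aged_files => 0, aged_size => 0);

    find(sub {
        if (-d $_) {
            ++$t{dirs} if $_ ne "." && $_ ne "..";
        } else {
            my $size = (-s _) || 0;   # reuse the stat done by -d
            my $age  = -C _;
            ++$t{files};
            $t{size} += $size;
            if (defined $age && $age >= $low && $age <= $high) {
                ++$t{aged_files};
                $t{aged_size} += $size;
            }
        }
    }, $root);

    return \%t;
}
```

The anonymous sub closes over %t, so all the calls made during one find() update the same hash, and the next scan_tree() call starts again from zero.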
You could also consider using File::Recurse, which has some niceties over File::Find (though it may no longer be maintained).
You could probably make your own code even faster than File::Find-based code by reworking it to use chdir (and the "-x _" trick). File::Find often has to stat a file itself, but your "wanted" routine can't tell when it has already done so. So you have to stat each file anyway, and much of the time both you and File::Find end up stat'ing the same file.
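A rough sketch of that hand-rolled approach, assuming you only need file count, directory count and total size (the names walk_cwd and scan_by_chdir are made up, the age test is left out for brevity, and I have not benchmarked this against File::Find):

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical walker: tallies everything under the current directory,
# doing one lstat per entry and reusing it through "_".
sub walk_cwd {
    my ($totals) = @_;
    opendir(my $dh, ".") or return;          # skip unreadable directories
    my @entries = grep { $_ ne "." && $_ ne ".." } readdir $dh;
    closedir $dh;

    for my $name (@entries) {
        next unless lstat $name;             # the only stat call for this entry
        if (-d _) {                          # symlinks are not followed
            ++$totals->{dirs};
            if (chdir $name) {
                walk_cwd($totals);
                chdir ".." or die "cannot return from $name: $!";
            }
        } else {
            ++$totals->{files};
            $totals->{size} += (-s _) || 0;  # for a symlink, the link's own size
        }
    }
}

# Hypothetical entry point: remembers the starting directory and restores it.
sub scan_by_chdir {
    my ($root) = @_;
    my $start  = getcwd();
    my %totals = (files => 0, dirs => 0, size => 0);
    chdir $root or die "cannot chdir to $root: $!";
    walk_cwd(\%totals);
    chdir $start or die "cannot return to $start: $!";
    return \%totals;
}
```

Because the code is always sitting in the directory it is reading, every lstat gets a bare file name, and nothing is stat'ed twice.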
- tye (but my friends call me "Tye")