
Wow! I didn't realize that I was traversing the tree twice until you said something. Maybe that is why it took a little while to run. I didn't use your exact suggestion, but I did merge the two pieces into one.

This ...

my @file_list;
sub files_wanted {
  my $text = $File::Find::name;
  if ( -f ) {
    push @file_list, $text;
  }
}
find(\&files_wanted,$directory);

my %files;
for my $raw_file (@file_list) {
  my @file_parts = split(/\//,$raw_file);
  my $file = pop @file_parts;
  my $file_size = -s $raw_file;
  push @{$files{"$file ($file_size bytes)"}}, $raw_file;
}

... is now this ...

my %files;
sub files_wanted {
  my $raw_file = $File::Find::name;
  if ( -f ) {
    my ($volume,$directories,$file) = File::Spec->splitpath($raw_file); # update from a prior suggestion.
    my $file_size = -s $raw_file;
    push @{$files{"$file ($file_size bytes)"}}, $raw_file;
  }
}
find(\&files_wanted,$directory);

The script runs a little faster now that the double traversal of the directory tree is gone. Thanks for showing me what I was really doing!

Have a cookie and a very nice day!
Lady Aleena

Re^3: Find duplicate files with exact same files noted
by jwkrahn (Abbot) on Aug 17, 2010 at 20:31 UTC
    my %files;
    sub files_wanted {
      my $raw_file = $File::Find::name;
      if ( -f ) {
        my ($volume,$directories,$file) = File::Spec->splitpath($raw_file); # update from a prior suggestion.
        my $file_size = -s $raw_file;
        push @{$files{"$file ($file_size bytes)"}}, $raw_file;
      }
    }

    While you are inside the "wanted" subroutine that File::Find::find runs, the full path is in $File::Find::name and the bare file name is in $_, so there is no need to call File::Spec->splitpath() to do something that File::Find::find has already done for you. Also, you are still calling stat on the same file twice (once for -f and once for -s) when it would be more efficient to do it only once.
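
One way both suggestions might look together, as a minimal sketch rather than a drop-in replacement. It takes the file name from $_ and reuses the stat result from -f through the special _ filehandle. The collect_files wrapper and its return value are assumptions made for illustration; the original script uses a global %files and a named files_wanted sub.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical wrapper (not in the original script): walks $directory once
# and returns a hash ref mapping "name (size bytes)" to the full paths found.
sub collect_files {
    my ($directory) = @_;
    my %files;

    find(sub {
        # Inside the callback, $_ is the bare file name and
        # $File::Find::name is the full path, so no splitpath() is needed.
        return unless -f $_;     # first and only stat of this file

        # The special filehandle "_" reuses the stat buffer filled by -f,
        # so -s does not stat the file a second time.
        my $file_size = -s _;

        push @{ $files{"$_ ($file_size bytes)"} }, $File::Find::name;
    }, $directory);

    return \%files;
}
```

Any key whose list holds more than one path is then a candidate duplicate by name and size. Testing $_ rather than $File::Find::name also matters because find() changes into each directory by default, so a relative full path may not resolve from inside the callback.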