in reply to Simple Recursion
Here's a cleaned-up version that should work. It adds a few extras and changes that you might find useful, and it may also run faster and be more flexible.
Some points worth noting:

    use strict;
    use Carp;

    sub readin_dir {
        my ( $filenames, $paths, $ext ) = @_;
        if ( ref($filenames) ne 'ARRAY' or ref($paths) ne 'ARRAY' or @$paths == 0 ) {
            carp( "readin_dir call lacks array_ref(s) for file names and/or paths\n" );
            return;
        }
        for my $path ( @$paths ) {
            $path =~ s{/+$}{};  # don't need or want trailing slash(es)
            opendir( my $dh, $path )
                or do { warn "readin_dir: open failed for $path\n"; next };
            my @subdirs = ();
            while ( my $file = readdir( $dh )) {
                next if ( $file =~ /^\.{1,2}$/ );
                if ( -d "$path/$file" ) {
                    push @subdirs, "$path/$file";
                }
                elsif ( $ext eq '' or $file =~ /\.$ext$/ ) {
                    push @$filenames, "$path/$file";
                }
            }
            closedir( $dh );
            readin_dir( $filenames, \@subdirs, $ext ) if ( @subdirs );
        }
    }
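One caveat about the sub above: `-d` follows symbolic links, so a symlink that points at a directory gets descended into like any other directory. Here is a minimal sketch of a variant that skips symlinks entirely, assuming that is the behavior you want. The name `readin_dir_nolinks` and the skip-everything policy are mine, not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of readin_dir; the name and the "skip every symlink"
# policy are assumptions for illustration, not part of the original sub.
sub readin_dir_nolinks {
    my ( $filenames, $paths, $ext ) = @_;
    $ext = '' unless defined $ext;
    for my $path ( @$paths ) {
        ( my $dir = $path ) =~ s{/+$}{};    # strip trailing slashes on a copy
        $dir = '/' if ( $dir eq '' );       # the path was the root directory
        opendir( my $dh, $dir )
            or do { warn "readin_dir_nolinks: open failed for $dir: $!\n"; next };
        my @subdirs;
        while ( defined( my $file = readdir( $dh ) ) ) {
            next if ( $file eq '.' or $file eq '..' );
            my $full = "$dir/$file";
            next if ( -l $full );           # never follow or record symlinks
            if ( -d $full ) {
                push @subdirs, $full;
            }
            elsif ( $ext eq '' or $file =~ /\.$ext$/ ) {
                push @$filenames, $full;
            }
        }
        closedir( $dh );
        readin_dir_nolinks( $filenames, \@subdirs, $ext ) if ( @subdirs );
    }
}

# Example: perl nolinks.pl dir1 dir2   (lists every non-symlink file)
if ( @ARGV ) {
    my @files;
    readin_dir_nolinks( \@files, [ @ARGV ], '' );
    print "$_\n" for @files;
}
```

Skipping symlinked files as well as symlinked directories matches what `find -type f` does by default. If you would rather keep symlinks to plain files, test `-d $full` before `-l $full` and skip only when both are true.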
Here's a little benchmark that compares the recursive function against both File::Find and a Unix "find" command opened as a file handle. The recursive function came out slowest for me, taking about twice as long as Unix "find"; File::Find ended up surprisingly close to Unix "find" (not much slower) in my case (perl 5.8.8 on Mac OS X).
(One difference I noticed was that the recursive sub ended up following symbolic links that caused it to count some files twice, whereas File::Find only counted files once, and unix find -- given its default usage -- did not follow symlinks at all. I'm a bit dismayed at having to use global variables inside the File::Find "wanted" function, but apart from that, it does "the right thing" reasonably well.)
    #!/usr/bin/perl
    use strict;
    use Benchmark;
    use File::Find ();

    my @found;
    my $ext = '';
    if ( @ARGV >= 2 and $ARGV[0] eq '-e' ) {
        shift;
        $ext = shift;
    }
    my @paths = ( @ARGV ) ? @ARGV : ( "." );
    die "Usage: $0 [-e ext] path ...\n" unless ( -d $paths[0] );

    # the leading digits only control the order of the report
    timethese( 10, {
        '2File::Find' => \&try_File_Find,
        '1Readin_dir' => \&try_readin_dir,
        '0Unix_find'  => \&try_unix_find,
    } );

    sub try_File_Find {
        @found = ();
        File::Find::find( { wanted => \&wanted,
                            # follow_fast => 1,
                          }, @paths );
        print "File::Find found ".scalar( @found )." matches\n";
    }

    sub try_readin_dir {
        @found = ();
        readin_dir( \@found, \@paths, $ext );
        print "readin_dir found ".scalar( @found )." matches\n";
    }

    sub try_unix_find {
        @found = ();
        # list-form pipe open: no shell involved, so odd path names are safe
        open( my $find, "-|", "find", @paths, "-type", "f" )
            or die "can't run find: $!\n";
        while (<$find>) {
            chomp;
            push @found, $_ if ( $ext eq '' or /\.$ext$/ );
        }
        close( $find );
        print "unix_find found ".scalar( @found )." matches\n";  #, join( "\n",@found,"" );
    }

    sub wanted {
        push @found, $File::Find::name if ( $ext eq '' or /\.$ext$/ );
    }

    sub readin_dir {
        my ( $filenames, $paths, $ext ) = @_;
        for my $path ( @$paths ) {
            $path =~ s{/+$}{};  # don't need or want trailing slash(es)
            opendir( my $dh, $path )
                or do { warn "readin_dir: open failed for $path\n"; next };
            my @subdirs = ();
            while ( my $file = readdir( $dh )) {
                next if ( $file =~ /^\.{1,2}$/ );
                if ( -d "$path/$file" ) {
                    push @subdirs, "$path/$file";
                }
                elsif ( $ext eq '' or $file =~ /\.$ext$/ ) {
                    push @$filenames, "$path/$file";
                }
            }
            closedir( $dh );
            readin_dir( $filenames, \@subdirs, $ext ) if ( @subdirs );
        }
    }
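On the point about needing global variables inside the File::Find "wanted" function: one common way around that is to pass an anonymous sub that closes over a lexical array. The following is only a sketch of that idea. The wrapper name `find_files` is hypothetical, and unlike the benchmark's `wanted` it skips directories and quotes the extension with `\Q...\E`, both of which are my own assumptions about what is wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find ();

# Hypothetical wrapper (not from the benchmark above): returns the names of
# non-directory entries under @paths, optionally filtered by extension.
sub find_files {
    my ( $ext, @paths ) = @_;
    $ext = '' unless defined $ext;
    my @found;    # lexical; the anonymous sub below closes over it
    File::Find::find(
        {
            wanted => sub {
                # File::Find chdir()s by default, so $_ is the bare file name
                return if ( -d $_ );
                push @found, $File::Find::name
                    if ( $ext eq '' or /\.\Q$ext\E$/ );
            },
        },
        @paths
    );
    return @found;
}

# Example: perl find_files.pl pl lib t
if ( @ARGV >= 2 ) {
    my ( $ext, @paths ) = @ARGV;
    print "$_\n" for find_files( $ext, @paths );
}
```

Because `@found` lives inside `find_files`, each call starts with a fresh list and nothing leaks into file scope. File::Find still sets its own package variables (`$File::Find::name` and friends), so this hides the globals from your code but does not remove them from the module.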
Re^2: Simple Recursion
by mcsonka (Initiate) on Dec 03, 2007 at 23:51 UTC