One of the problems I've had in the past is a need to walk a filesystem and 'batch up' files. There's a variety of reasons why - things like archiving, virus scanning, etc. Now, you _could_ do it the heavyweight way - collect the full directory tree and batch it up from that. This didn't suit my needs - I've a billion-ish files to inspect, and they change rather frequently.

So as a workaround - make use of File::Find and its ability to prune, so a run can skip ahead to where the last batch left off:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use File::Find;

    my $start_from = "/path/to/search/some_dir/beneath";
    my $count      = 10_000;    # how many files to grab in this 'batch'
    my $found      = 0;         # set once the walk has reached $start_from
    my @file_list;

    sub finder {
        if ( defined $start_from and not $found ) {
            # partial match: this entry is on the way to $start_from, so walk it
            if ( $start_from =~ m/\Q$File::Find::name/ ) {
                $File::Find::prune = 0;
                if ( $File::Find::name =~ m/\Q$start_from/ ) {
                    $found = 1;
                }
            }
            else {
                $File::Find::prune = 1;    # don't traverse into this dir
                return;                    # and don't collect entries we're skipping past
            }
        }
        if ( @file_list >= $count ) {
            # batch is full: prune everything else so the walk finishes quickly
            $found             = 0;
            $File::Find::prune = 1;
            return;
        }
        return unless -f $File::Find::name;
        push( @file_list, $File::Find::name );
        # backtracks a bit to the start of the current directory
        $start_from = $File::Find::dir;
    }

    find( \&finder, '/path/to/search' );
    print "Next start point: $start_from\n";

Note - as it stands, this has a limitation: it'll misbehave if the directory structure changes (e.g. $start_from no longer exists). The workaround is chopping path elements off the end until you get to a dir that _does_ exist.

Probably something like:

    while ( not -d $start_from and $start_from =~ m,/, ) {
        $start_from =~ s,/[^/]+$,,;
    }

(There's probably a better solution using File::Spec or similar)
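For what it's worth, here is a rough sketch of what a File::Spec version might look like. It is untested against the billion-file case, and `nearest_existing_dir` is a name made up for illustration, not something File::Spec provides. It only uses `splitpath`, `splitdir`, `catdir` and `catpath`, and assumes the argument is a directory path rather than a file path.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of File::Spec): return the deepest
# ancestor of $path that exists as a directory, or nothing if none does.
sub nearest_existing_dir {
    my ($path) = @_;

    # The true second argument tells splitpath to treat the whole path
    # as directories, i.e. there is no trailing file name to split off.
    my ( $volume, $dirs ) = File::Spec->splitpath( $path, 1 );
    my @parts = File::Spec->splitdir($dirs);

    while (@parts) {
        my $candidate =
          File::Spec->catpath( $volume, File::Spec->catdir(@parts), '' );
        return $candidate if -d $candidate;
        pop @parts;    # chop one path element off the end and retry
    }
    return;
}

# Usage with the placeholder path from the script above.
my $resume_from = nearest_existing_dir('/path/to/search/some_dir/beneath');
print "Resume from: ", ( defined $resume_from ? $resume_from : '(nothing found)' ), "\n";
```

One caveat: for an absolute path this can climb all the way up to the filesystem root, so you would probably want to fall back to the top of the search tree if the result ends up outside it.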

Re: Restarting File::Find
by beech (Parson) on Nov 24, 2015 at 23:11 UTC
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Path::Tiny qw/ path /;

    print existingParent( 'C:/WINDOWS/roshambonotexist/bl/ah/di/blah' ), "\n";
    print existingParent( 'Q:/a/b/c/d/e/f/g/h/i/j/k/l' ), "\n";

    # Walk up from the given path until something that exists is found
    sub existingParent {
        my $dir = path( shift )->absolute;
        while ( not $dir->exists ) {
            $dir = $dir->parent;
            last if $dir->is_rootdir;
        }
        return $dir if $dir->exists;
        return "";
    }
    __END__
    $ perl path-tiny-parent.pl
    C:/WINDOWS