One of the problems I've had in the past, is a need to walk a filesystem and 'batch up' files. There's a variety of reasons why - things like archiving, virus scanning, etc. Now, you _could_ do it the heavyweight way - collect a full tree directory structure, batch up that way. This didn't suit my needs - I've a billion ish files to inspect, and they change rather frequently.
So as a workaround - make use of File::Find and it's ability to prune
#!/usr/bin/env perl use strict; use warnings; use File::Find; my $start_from = "/path/to/search/some_dir/beneath"; my $count = 10_000; #how many files to grab in this 'batch'; my @file_list; sub finder { if ( defined $start_from and not $found ) { #partial match, walk directory. if ( $start_from =~ m/\QFile::Find::name/ ) { $File::Find::prune = 0; if ( $File::Find::name =~ m/\Q$start_from/ ) { $found = 1; } } else { $File::Find::prune = 1; #don't traverse into this dir } } if ( @file_list > $limit ) { $found = 0; $File::Find::prune = 1; return; } return unless -f $File::Find::name; push ( @file_list, $File::Find::name ); #backtracks a bit to the start of the current directory $start_from = $File::Find::dir; } find ( \&finder, '/path/to/search' ); print "Next start point: $start_from\n";
Note - as it stands, this has a limiting factor in that it'll misbehaving if the directory structure changes (e.g. $start_from no longer exists. The workaround is chopping path elements off the end until you get to a dir that _does_ exist.
Probably something like:
while ( not -d $start_from and $start_from =~ m,/, ) { $start_from =~ s,/[^/]+$,,; }
(There's probably a better solution using File::Spec or similar)
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Restarting File::Find
by beech (Parson) on Nov 24, 2015 at 23:11 UTC |