Pruning directory searches with File::Find

annie has asked for the wisdom of the Perl Monks concerning the following question:


Re: Pruning directory searches with File::Find
by broquaint (Abbot) on Jul 25, 2003 at 19:40 UTC

File::Find::Rule

use File::Find::Rule;

my @files = find(
  directory =>
  not_name  => qr/^_/,
  in        => \@ARGV  ## or whatever; an array ref, so every start directory is searched
);

for(@files) {
  ...
}


_________ broquaint
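Note that a not_name clause only filters what is returned; the search still descends into the underscore directories. A minimal sketch of actually pruning them, assuming the module's documented or, prune and discard clauses, follows. The helper name dirs_outside_underscore_trees is made up for illustration.

```perl
use strict;
use warnings;
use File::Find::Rule;

# Hypothetical helper: return the directories under @roots, skipping any
# directory whose name starts with "_" and everything below it.
sub dirs_outside_underscore_trees {
    my @roots = @_;

    # First alternative: match "_" directories, stop descending into them,
    # and keep them out of the result list.
    my $skip = File::Find::Rule->directory->name(qr/^_/)->prune->discard;

    # Second alternative: every other directory is returned as usual.
    my $keep = File::Find::Rule->directory;

    return File::Find::Rule->or($skip, $keep)->in(@roots);
}

print "$_\n" for dirs_outside_underscore_trees(@ARGV ? @ARGV : '.');
```

The alternatives passed to or are tried in order, so the pruning rule has to come first.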


Re: Re: Pruning directory searches with File::Find

by skyknight (Hermit) on Jul 25, 2003 at 20:07 UTC

It would seem to me that that particular module's paradigm for extracting files from the file system could pose some serious memory issues if the rules that you specify result in returning nearly all of the files in the tree you specified, which may certainly be the case if you're specifying just a lenient "not" rule for pruning things out. This is akin to slurping an entire file instead of reading it line by line. Often you can get away with it as the file will be of a reasonable length, but sometimes you'll get burned when you try to blast an enormous file into memory. Slurp a short config file, and nobody will notice; slurp a SQL transaction log that hasn't been rotated recently and you could bring the system to its knees. Caveat Slurpor.


Re: Re: Re: Pruning directory searches with File::Find

by broquaint (Abbot) on Jul 25, 2003 at 23:52 UTC

"It would seem to me that that particular module's paradigm for extracting files from the file system could pose some serious memory issues if the rules that you specify result in returning nearly all of the files in the tree you specified"

SQL

SELECT *

SQL

use File::Find::Rule;

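my $dir_rule = rule(
  directory =>
  not_name  => qr/^_/,
  start     => \@ARGV,  ## or whatever; an array ref, so every start directory is used
);

## test with defined() so a directory named "0" doesn't end the loop early
while(defined(my $dir = $dir_rule->match)) {
  ...
}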

_________ broquaint


Re: Pruning directory searches with File::Find
by bluto (Curate) on Jul 25, 2003 at 20:08 UTC

# In wanted(): $_ holds the name of the current entry.
if (-d _ and /^_/) {
    $File::Find::prune = 1;   # don't descend into this directory
    return;
}

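Update: Or you may want to use "-d $_" instead. A bare "_" reuses the result of the previous stat, and File::Find only guarantees that one has happened when the follow or follow_fast option is set.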

bluto
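For anyone who wants to try the $File::Find::prune approach in a complete script, here is a minimal sketch using only core modules. The throwaway directory tree, its file names, and the helper files_outside_underscore_dirs are all made up for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a small throwaway tree (hypothetical names) to search.
my $top = tempdir(CLEANUP => 1) . "/tree";
make_path("$top/keep/sub", "$top/_skip/deep");
for my $file ("keep/a.txt", "keep/sub/b.txt", "_skip/c.txt", "_skip/deep/d.txt") {
    open(my $fh, '>', "$top/$file") or die "Can't create $top/$file: $!";
    close($fh);
}

# Return the plain files under $dir without descending into "_" directories.
sub files_outside_underscore_dirs {
    my ($dir) = @_;
    my @found;
    find(sub {
        if (-d $_ and /^_/) {
            $File::Find::prune = 1;   # skip this directory's contents
            return;
        }
        push @found, $File::Find::name if -f $_;
    }, $dir);
    my @sorted = sort @found;
    return @sorted;
}

# Prints only the two files under keep/.
print "$_\n" for files_outside_underscore_dirs($top);
```

The File::Find documentation notes that prune has no effect when the bydepth option is set, because a directory's contents are then visited before the directory itself.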


Re^2: Pruning directory searches with File::Find

by particle (Vicar) on Jul 25, 2003 at 20:57 UTC

actually, you should use:

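use File::Spec::Functions 'catfile';  ## File::Spec itself exports nothing

## later, in sub wanted...
## (File::Find chdir()s into each directory by default, so this
## path only resolves if find() was started with an absolute path)
if( -d catfile( $File::Find::dir, $_) 
    && m/\A_/)
{
    $File::Find::prune= 1;
    return;
}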

you need to specify the absolute path to the file system object you're accessing. $_ stores the name relative to the current search directory within File::Find. also, File::Spec will give you the platform independence you secretly crave ;P

but, overall, i'd still suggest broquaint's method. File::Find::Rule makes code like this easier to code, understand, and maintain.

~Particle *accelerates*


Re: Re^2: Pruning directory searches with File::Find

by bluto (Curate) on Jul 25, 2003 at 21:03 UTC

bluto


Re: Pruning directory searches with File::Find
by skyknight (Hermit) on Jul 25, 2003 at 19:57 UTC

I don't think that File::Find is going to let you throw away directory subtrees that you don't like. You could put code into your wanted() function that will ignore files with a path of the form that you describe, but (to the best of my understanding) you're going to end up having File::Find waste time by walking through whole subtrees that you'd rather ignore. You might try using the following idiom in place of File::Find to accomplish what you want...

use strict;
use warnings;
use Cwd;

my $cwd = Cwd::getcwd();
my $directory = shift(@ARGV) || $cwd;
# Make a relative argument absolute.
$directory = $cwd . '/' . $directory unless $directory =~ /^\//;
my @queue = ($directory);

while (@queue) {
    my $node = shift(@queue);

    if (-d $node) {
        opendir(my $dh, $node) or do { warn "Can't open $node: $!"; next };
        # Skip '.', '..' and any entry whose name starts with '_'.
        push(@queue, map { $node . '/' . $_ } 
                     grep { $_ ne '.' and $_ ne '..' and $_  !~ /^_/ }
                     readdir($dh));
        closedir($dh);
    }
    else {
        do_your_stuff($node);
    }
}

This will do a breadth-first search (entries are pushed onto the end of the queue and shifted off the front) on either the current working directory or the directory that you specify on the command line, and it will ignore every entry whose name begins with _, which includes all subtrees of such directories. I hope this helps, and I hope it doesn't turn out to be a horribly convoluted way of doing it if there is an easier way with File::Find.

Update: It has come to my attention that exploiting the $File::Find::prune variable is a much more expeditious way of accomplishing the pruning. I confess! I confess! Now stop minus one-ing me, I admit the error of my ways.


Re: Pruning directory searches with File::Find
by PodMaster (Abbot) on Jul 26, 2003 at 01:38 UTC

use File::Find;

find( {
  # called with each directory's entry names; return only those to visit
  preprocess => sub { my @foo = grep { ! /^_whatever/  } @_; @foo; },
  wanted => \&wanted,
}, 'rhebarb' );
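
To see the preprocess option at work end to end, here is a minimal sketch using only core modules. The throwaway tree, the file names, and the helper files_via_preprocess are invented for illustration. Note that this filter drops every entry starting with an underscore, plain files included, not just directories.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Build a small throwaway tree (hypothetical names) to search.
my $top = tempdir(CLEANUP => 1) . "/tree";
make_path("$top/keep", "$top/_skip");
for my $file ("keep/a.txt", "_skip/b.txt", "_c.txt") {
    open(my $fh, '>', "$top/$file") or die "Can't create $top/$file: $!";
    close($fh);
}

# Return the plain files under $dir, filtering each directory listing
# before File::Find visits or descends into any of its entries.
sub files_via_preprocess {
    my ($dir) = @_;
    my @found;
    find({
        preprocess => sub { grep { !/^_/ } @_ },
        wanted     => sub { push @found, $File::Find::name if -f $_ },
    }, $dir);
    my @sorted = sort @found;
    return @sorted;
}

# Prints only keep/a.txt; _c.txt and everything under _skip are filtered out.
print "$_\n" for files_via_preprocess($top);
```

Because the filtering happens on the directory listing itself, wanted() is never called for the filtered entries at all.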

