klink'on has asked for the wisdom of the Perl Monks concerning the following question:

The basic plot is: I need to find certain files within a directory tree and, when found, append the contents of all the found files into one giant file, with a couple of exceptions.

root dir: /home/tdawg/workspace/zoo

file(s) to find: list.dat

So I started with find2perl:

    use strict;
    use File::Find ();

    # for the convenience of &wanted calls, including -eval statements:
    use vars qw/*name *dir *prune/;
    *name   = *File::Find::name;
    *dir    = *File::Find::dir;
    *prune  = *File::Find::prune;

    sub wanted;

    # Traverse desired filesystems
    File::Find::find({wanted => \&wanted}, '/home/tdawg/workspace/zoo');
    exit;

    sub wanted {
        # $_ is the basename of the current entry
        /^list\.dat\z/s
        && print("$name\n");
    }

Now, before trying to copy the contents to some giant file (>> giant.file), there are 2 exceptions.

1) Within the root directory, there is only a subset of directories I want to traverse. I actually only want to search through the /home/tdawg/workspace/zoo/abc1* directories. Of course, putting a wildcard in there like that doesn't work. So if zoo contains the dirs:

abc1_dos

beta_dos

prod_tres

abc1

we will only traverse the abc1_dos and abc1 subtrees.

2) The next exception is that we don't care about list.dat if it's in certain sub-directories. We want to ignore list.dat if it's under a dead or baby sub-directory.

/home/tdawg/workspace/zoo/abc1/adult/main/list.dat #yes!

/home/tdawg/workspace/zoo/abc1/dead/list.dat #ignore

/home/tdawg/workspace/zoo/abc1/biped/baby/list.dat #ignore

If someone can help me get to the point where I am working only with the list.dat files that I want, it would be much appreciated. Various ways of trying to exclude things like /^baby/s within the wanted sub are not working for me.
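
One likely reason a pattern such as /^baby/s never matches: with File::Find's default settings, $_ inside wanted holds only the basename of the current entry (here list.dat), while the directory part is in $File::Find::dir and the full path in $File::Find::name. The two exceptions can then be written as a test on the full path. The following is only a minimal sketch; the sub name is_wanted_path and its arguments are hypothetical, and it assumes paths use / as the separator.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $path is a list.dat below an abc1* child of
# $root and no directory component on the way down is named dead or baby.
sub is_wanted_path {
    my ($path, $root) = @_;

    # Keep only the part below the root, and insist it starts with abc1*.
    my ($rel) = $path =~ m{^\Q$root\E/(abc1[^/]*/.*)\z}s or return 0;

    my @parts = split m{/}, $rel;
    my $file  = pop @parts;

    return 0 unless defined $file && $file eq 'list.dat';
    return 0 if grep { $_ eq 'dead' || $_ eq 'baby' } @parts;
    return 1;
}
```

Inside wanted, such a test would be called as is_wanted_path($File::Find::name, $root), not matched against $_.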

Replies are listed 'Best First'.
Re: getting picky with File::Find
by choroba (Cardinal) on Apr 08, 2014 at 21:57 UTC
    To check that the path is of the form zoo/abc something, I used the preprocess option. To check for babies and zombies, I used a regular expression in the wanted subroutine.
    #!/usr/bin/perl
    use warnings;
    use strict;

    use File::Find qw{ find };

    find({ wanted     => \&wanted,
           preprocess => \&filter,
         }, '/home/choroba/0');

    sub wanted {
        return unless 'list.dat' eq $_
                   and $File::Find::dir !~ /(?: baby | dead )$/x;
        warn $File::Find::name;
    }

    sub filter {
        return if $File::Find::dir =~ m{ zoo/(?!abc) }x;
        return @_
    }
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
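
A variation on the same idea, offered as a sketch and not as choroba's code: setting $File::Find::prune inside wanted keeps File::Find from descending into dead and baby at all, so list.dat files nested deeper beneath them are skipped too. The starting points come from glob, so only the abc1* children of the root are traversed. The sub name find_lists and its $root argument are hypothetical, and the glob call assumes the root path contains no whitespace or wildcard characters.

```perl
use strict;
use warnings;
use File::Find qw{ find };

# Hypothetical helper: return every list.dat below the abc1* directories
# of $root, skipping anything under a directory named dead or baby.
sub find_lists {
    my ($root) = @_;

    # Only the abc1* children of the root are used as starting points.
    my @starts = grep { -d $_ } glob("$root/abc1*");
    return unless @starts;

    my @found;
    find(sub {
        # With the default settings, $_ is the basename of the entry.
        if (-d $_ && /^(?:dead|baby)\z/) {
            $File::Find::prune = 1;    # do not descend into this subtree
            return;
        }
        push @found, $File::Find::name if $_ eq 'list.dat' && -f $_;
    }, @starts);

    my @sorted = sort @found;
    return @sorted;
}
```

A call such as find_lists('/home/tdawg/workspace/zoo') would return the paths whose contents are to be appended to the giant file.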
Re: getting picky with File::Find
by RichardK (Parson) on Apr 08, 2014 at 23:02 UTC

    I think File::Find::Rule is easier to use so here's my attempt, using glob to get just the abc* directories. This ignores everything below baby/dead directories.

    use v5.18;
    use warnings;
    use autodie;

    use File::Find::Rule;

    my $rule = File::Find::Rule->or(
        File::Find::Rule->name('baby','dead')
                        ->directory()
                        ->prune()
                        ->discard() ,
        File::Find::Rule->file->name('list.dat')
    );

    my @files = $rule->in( glob('abc*') );

    say "****************";
    say for @files;

      File::Find::Rule exports rule/find by default, so there is less typing; I add find/rule to the import list for documentation purposes.

      #!/usr/bin/perl --
      use strict;
      use warnings;
      use Data::Dump qw/ dd pp /;
      use File::Find::Rule qw/ rule find /;

      my @dirs = find( qw/ directory name abc* in . /);
      dd( \@dirs );

      my $nobaby = rule( name => [ 'baby', 'dead' ], qw/ prune discard/ );
      my $list = rule( file => name => 'list.dat', );
      my @files = find(
          not => $nobaby ,
          any => $list,
          in => \@dirs,
      );
      dd( \@files );

      @files = rule()->or(
          ## !IMPORTANT NOTE discard before prune with rule()/find()
          rule( name => [ 'baby','dead' ], qw/ directory discard prune /),
          rule( qw' file name list.dat ')
      )->in(
          find( qw/ directory name abc* in . maxdepth 1 /),
      );
      dd( \@files );
      __END__
      $ perl ffffindrule.pl
      ["abcdefg"]
      ["abcdefg/coy/abu/list.dat", "abcdefg/coy/dabie/list.dat"]
      ["abcdefg/coy/abu/list.dat", "abcdefg/coy/dabie/list.dat"]

      $ findrule abcdefg
      abcdefg
      abcdefg/coy
      abcdefg/coy/abu
      abcdefg/coy/abu/list.dat
      abcdefg/coy/baby
      abcdefg/coy/baby/list.dat
      abcdefg/coy/dabie
      abcdefg/coy/dabie/list.dat
      abcdefg/coy/dead
      abcdefg/coy/dead/list.dat
Re: getting picky with File::Find::Rule
by Anonymous Monk on Apr 08, 2014 at 22:24 UTC
    Doesn't descend into baby directories, finds files named list.dat ... modify to what you need
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Data::Dump qw/ dd /;
    use File::Find::Rule qw/ rule find /;

    my $thisdir = shift ;
    my $nobaby = rule( directory => name => 'baby', 'prune' );
    my $list = rule( file => name => 'list.dat', );
    my @files = find(
        not => $nobaby ,
        or => $list,
        in => $thisdir,
    );
    dd( \@files );
    __END__

    $ tree -f -a coy
    coy
    |-- coy/abu
    |   `-- coy/abu/list.dat
    |-- coy/baby
    |   `-- coy/baby/list.dat
    `-- coy/dabie
        `-- coy/dabie/list.dat

    3 directories, 3 files

    $ perl fffindrule.pl
    ["coy/abu/list.dat", "coy/dabie/list.dat"]
Re: getting picky with File::Find
by oiskuu (Hermit) on Apr 09, 2014 at 06:02 UTC

    And a unixy alternative:

    $ find /home/tdawg/workspace/zoo/abc1* \
        -name dead -prune -o -name baby -prune -o \
        -name list.dat -type f -print0 | xargs -0 cat > gigantofile.dat
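
For anyone staying in Perl, the concatenation half of that pipeline might look like the sketch below. It assumes the list of paths has already been collected by one of the approaches in this thread; the sub name append_files is hypothetical. Note that it opens the target with '>>' (append, as in the original question), whereas the shell redirect > above truncates the file first.

```perl
use strict;
use warnings;

# Hypothetical helper: append the contents of each file in @sources
# to $target, in the order given. Returns the number of files copied.
sub append_files {
    my ($target, @sources) = @_;

    open my $out, '>>', $target or die "Cannot open $target: $!";
    binmode $out;

    for my $src (@sources) {
        open my $in, '<', $src or die "Cannot open $src: $!";
        binmode $in;

        # Copy in fixed-size chunks so large files are not slurped whole.
        my $buf;
        while (read($in, $buf, 65536)) {
            print {$out} $buf;
        }
        close $in;
    }

    close $out or die "Cannot close $target: $!";
    return scalar @sources;
}
```

A call would be along the lines of append_files('giant.file', @files), with @files holding the list.dat paths that survived the filtering.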

Re: getting picky with File::Find
by klink'on (Initiate) on Apr 10, 2014 at 16:09 UTC
    After adding the new pm and adjusting @INC, I went with using File::Find::Rule. Thanks all!