ppm has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I use File::Find to get a list of all files in a directory:

use File::Find;
my $base = "c:/blub";
find( sub { push(@dateien, $File::Find::name) }, $base );
This works fine, but how do I write the code to get only files with the desired extensions, e.g.
my $ext = "html|php";
Only these types should be pushed onto the list. Finally, I also want to exclude some directories, e.g.
my $dirs = "/cgi-bin|log";
With a regex I would do it like this:
/\.($ext)$/i;
Any ideas how to do this in my code? Thanks a lot for your help!

ppm

Replies are listed 'Best First'.
Re: Need help with File::Find
by broquaint (Abbot) on Feb 26, 2003 at 15:27 UTC
    Out with the old and in with the new
    use File::Find::Rule;

    my $ext  = "html|php";
    my $dirs = "(?:cgi-bin|log)";
    my $base = "c:/blub";

    my @files = find(
        not  => rule( directory => name => qr/^$dirs/ => prune => ),
        file => name => qr/\.(?:$ext)\z/,
        in   => $base,
    );
    See the File::Find::Rule docs for more info on this fabulous module.
    HTH

    _________
    broquaint

    update: code now corresponds to question
    update 2: added grouping to $dirs missing from original data
    update 3: fixed code re runrig's comments. sigh this just ain't my node

      Your RE matches /cgi-bin and plain log, not /log, because the leading slash binds only to the first alternative. You need a (?: ) in there...
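
To make the grouping issue concrete, here is a minimal sketch. The directory names in @names are made up for illustration, and the anchored pattern is just one reasonable way to tighten the match, not necessarily what any poster here used:

```perl
use strict;
use warnings;

my $dirs = "/cgi-bin|log";    # pattern string from the question

# Ungrouped: parsed as  ^/cgi-bin  OR  log , so "log" matches anywhere.
my $loose = qr/^$dirs/;

# Grouped and anchored, no leading slash: the whole name must be
# exactly "cgi-bin" or "log".
my $tight = qr/^(?:cgi-bin|log)\z/;

# Hypothetical directory basenames, for illustration only.
my @names = qw(cgi-bin log catalog logs);

my @loose_hits = grep { $_ =~ $loose } @names;
my @tight_hits = grep { $_ =~ $tight } @names;

print "loose: @loose_hits\n";    # loose: log catalog logs
print "tight: @tight_hits\n";    # tight: cgi-bin log
```

The ungrouped pattern misses cgi-bin (the basename has no leading slash) and wrongly catches catalog and logs.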

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      A few problems:
      • Your extensions don't match just the end of the file name.
      • You're missing a '=>' after the second 'name'.
      • The pruning rule doesn't seem to work in combination with the first 'file => name' rule (update: if the 'file => name' rule comes after the 'not ( ... prune )' rule, then the procedural answer works).
      • The dirs still are not right. You shouldn't have a leading slash, and you probably ought to anchor the regex.

      I couldn't get the prune to work with the procedural interface, so here's the OO style answer:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use File::Find::Rule;
      use constant FFR => "File::Find::Rule";

      my $ext  = qr/\.(?:html|php)$/;
      my $dirs = qr/^(?:cgi-bin|log)$/;
      my $base = "c:/blub";

      my @files = FFR->or(
          FFR->directory->name($dirs)->prune->discard,
          FFR->file->name($ext),
      )->in($base);

      print "$_\n" for @files;
      Update: The procedural interface works if you put the first 'file => name ...' rule AFTER the 'not ... => prune' rule (just like it is above in the OO interface).
Re: Need help with File::Find
by jasonk (Parson) on Feb 26, 2003 at 15:21 UTC
    find( sub { push(@dateien, $File::Find::name) if /\.($ext)$/i; }, $base );
      Oh, thanks! In this code I also exclude all non-ASCII (non-text) files:
      use File::Find;
      my $base = "c:/blub";
      my $ext  = "html|js";
      find(
          sub { push(@dateien, $File::Find::name) if ( /\.($ext)$/i && -T ); },
          $base
      );
      Now I also want to exclude some directories defined in a variable:
      my $dirs = "cgi-bin|logs";
      Or would a hash or an array be better for this? The directories should be combined with "or", and if the value is empty it should be ignored. Same for the extensions.

      Thanks again (sorry, I'm new to Perl)

      ppm

Re: Need help with File::Find
by tachyon (Chancellor) on Feb 26, 2003 at 15:30 UTC
    use File::Find;

    my @exts     = qw( jpg gif tar gz bat txt );
    my @excludes = qw( winnt cygwin );
    my $ext      = join '|', map {quotemeta} @exts;
    my $exclude  = join '|', map {quotemeta} @excludes;
    my $base     = '/cygwin';

    find(
        sub {
            push @datein, $File::Find::name
                if m/\.(?:$ext)$/oi
                and $File::Find::name !~ m!/(?:$exclude)!oi;
        },
        $base
    );

    print "$_\n" for @datein;

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Thanks, all, for your help!

      @tachyon:

      What if I also want the option to search for all (*.*) files and all directories? How would I extend your script for this?

      Thanks again ;o)

        Instead of changing the code, allow the user to pass some options in. If tachyon would be so kind as to allow me to lift his code:
        use strict;
        use warnings;
        use File::Find;
        use Getopt::Long;
        use vars qw($all @datein @exts @excl $base);

        GetOptions(
            'exts|t=s' => \@exts,
            'excl|c=s' => \@excl,
            'base|b=s' => \$base,
            'all|a'    => \$all,
        );

        # Fall back to the defaults unless -all was given.
        if (not $all) {
            @exts = qw( jpg gif tar gz bat txt ) unless @exts;
            @excl = qw( winnt cygwin )           unless @excl;
        }
        $base ||= '/cygwin';

        my $exts = join '|', map {quotemeta} @exts;
        my $excl = join '|', map {quotemeta} @excl;

        find(
            sub {
                # An empty extension or exclude list disables that filter.
                push @datein, $File::Find::name
                    if  (!$exts or m/\.(?:$exts)$/oi)
                    and (!$excl or $File::Find::name !~ m!/(?:$excl)!oi);
            },
            $base,
        );

        print "$_\n" for @datein;
        Now you can call the script like so:
        ./find.pl -exts=txt -exts=gif -excl=nodes -base=.
        
        or this if you want everything:
        ./find.pl -base=/ -all
        
        If specifying the extensions one at a time becomes too tedious for you, you can rewrite the code to allow the user to pass a delimited string instead. Read up on the Getopt::Long docs for more info. Additionally, check out the Unix command find.
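
A minimal sketch of the delimited-string variant, assuming commas as the delimiter. The helper name parse_args is hypothetical and exists only to keep the example self-contained; it uses GetOptionsFromArray so the argument list can be passed in directly instead of read from @ARGV:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# parse_args is a hypothetical helper for this sketch.
# Each list option may be repeated and may also carry several
# comma-separated values.
sub parse_args {
    my @args = @_;
    my (@exts, @excl);
    GetOptionsFromArray(
        \@args,
        'exts|t=s' => \@exts,
        'excl|c=s' => \@excl,
    ) or die "bad options\n";

    # Flatten "txt,gif" into ("txt", "gif") and drop empty fields.
    @exts = grep { length } split /,/, join ',', @exts;
    @excl = grep { length } split /,/, join ',', @excl;
    return (\@exts, \@excl);
}

my ($exts, $excl) = parse_args('--exts=txt,gif', '--exts=html', '--excl=nodes');
print "exts: @$exts\n";    # exts: txt gif html
print "excl: @$excl\n";    # excl: nodes
```

Joining first and then splitting lets repeated options and comma lists be mixed freely on the same command line.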

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)
        

        Change the if /\.(?:$ext)/ clause to if -f $_
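
As a rough sketch of that change, the function below collects every plain file with -f and, as an assumption on top of the suggestion, prunes excluded directories by basename using a hash. The name all_files and its argument layout are hypothetical; an empty exclude list simply excludes nothing:

```perl
use strict;
use warnings;
use File::Find;

# all_files is a hypothetical helper for this sketch.
# Returns every plain file under $base, skipping directories whose
# basename appears in @$excludes (compared case-insensitively).
sub all_files {
    my ($base, $excludes) = @_;
    my %skip = map { lc($_) => 1 } @$excludes;
    my @found;
    find(
        sub {
            # $_ is the basename; File::Find has changed into the parent dir.
            if (-d $_ and $skip{ lc $_ }) {
                $File::Find::prune = 1;    # do not descend into this directory
                return;
            }
            push @found, $File::Find::name if -f $_;
        },
        $base,
    );
    return @found;
}
```

It could be called as all_files($base, ['cgi-bin', 'log']), or with an empty array reference to get every file.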

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Need help with File::Find
by toadi (Chaplain) on Feb 26, 2003 at 15:25 UTC
    use File::Find;

    my @TOP = qw(/www);    # top level directories

    sub PRUNE {    # don't search these dirs
        ## $_[0] is basename, $_[1] is full path
        $_[0] =~ /private/;
    }

    sub IGNORE {    # don't notice these files
        ## $_[0] is basename, $_[1] is full path
        $_[0] =~ /^\.|~$|\.(gif|jpe?g)$/;
    }

    find(
        sub {
            return $File::Find::prune = 1 if PRUNE $_, $File::Find::name;
            return unless -f;    # only files
            return if IGNORE $_, $File::Find::name;
            # anything that gets this far is a wanted file; handle it here
        },
        @TOP
    );

    Stolen from merlyn a long time ago, but I use it all the time for this situation.

    --
    My opinions may have changed,
    but not the fact that I am right