PerlMonks  

File test in grep not excluding current directory

by GotToBTru (Prior)
on Jul 01, 2014 at 13:22 UTC ( [id://1091839]=perlquestion )

GotToBTru has asked for the wisdom of the Perl Monks concerning the following question:

I am not sure if it is File::Find or grep that is responsible for the behavior I see.

use strict;
use warnings;
use File::Find;

my $dir='/home/edi/wlsedi/howard/temp';

find({preprocess => sub { return grep { -M $_ < 1 } @_ },
      wanted     => sub { printf "%s\n",$_ if (-f $_) } },
     $dir);

Output(as expected):

file1
file2
file3

I would rather have the directory case handled in preprocess than wanted.

use strict;
use warnings;
use File::Find;

my $dir='/home/edi/wlsedi/data_backup/univfiledrop';

find({preprocess => sub { return grep { -f $_ && -M $_ < 1 } @_ },
      wanted     => sub { printf "%s\n",$_ } },
     $dir);

Output:

.
file1
file2
file3

Why is . included?

Update

File::Find calls the wanted function with the directory name before it performs the readdir() on the directory. The preprocess routine is not called for this invocation.

use strict;
use warnings;
use File::Find;

my $p = 0;
my $dir='/home/edi/wlsedi/howard/temp';

find({preprocess => sub { printf "p %d %s\n",$p++,$_; return @_ },
      wanted     => sub { printf "w %d %s\n",$p++,$_ } },
     $dir);

ls -e /home/edi/wlsedi/howard/temp
total 0
drwxr-xr-x- 2 wlsedi wlsedi 256 Jul 01 09:33 dirhere
-rw-r--r--- 1 wlsedi wlsedi   0 Jul 01 08:14 file1
-rw-r--r--- 1 wlsedi wlsedi   0 Jul 01 08:14 file2
-rw-r--r--- 1 wlsedi wlsedi   0 Jul 01 08:14 file3

Output:

w 0 .
p 1 .
w 2 file1
w 3 file2
w 4 file3
w 5 dirhere
p 6 dirhere
1 Peter 4:10

Replies are listed 'Best First'.
Re: File test in grep not excluding current directory (chdir)
by tye (Sage) on Jul 01, 2014 at 13:41 UTC

    The File::Find documentation says that 'preprocess' is meant for things that care only about the name. Doing file test operators is exactly the kind of thing that they are saying you should not do in 'preprocess'.

    'preprocess' happens before chdir is called. So the only reason you would get "File1" in the output would be if you have "File1" in both a directory and a subdirectory of that directory.

    Writing out $File::Find::name as well as $_ shows why you get "." with your second code. 'preprocess' is called after readdir(). Before you even get to readdir(), File::Find chdir()s into the directory you passed in and processes that directory itself, passing '.' to your 'wanted' sub.

    - tye        
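
    The effect described above can be reproduced with a minimal sketch. It assumes a throwaway directory made with File::Temp and a made-up file name (file1); neither comes from the original post. It records both $_ and $File::Find::name on every 'wanted' call:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch tree (not from the original post): one plain file.
my $top = tempdir(CLEANUP => 1);
open my $fh, '>', "$top/file1" or die "open: $!";
close $fh;

my @seen;    # one [ $_, $File::Find::name ] pair per wanted call
find({
    # Same kind of filter as in the question: drops '.', '..' and directories.
    preprocess => sub { return grep { -f $_ } @_ },
    wanted     => sub { push @seen, [ $_, $File::Find::name ] },
}, $top);

# The starting directory shows up with $_ set to '.', even though the
# preprocess filter above would have rejected it.
printf "%-6s %s\n", @$_ for @seen;
```

    The first pair has '.' in $_ and the starting directory in $File::Find::name, which suggests that this call never passes through 'preprocess'.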

      I can verify that it calls wanted before preprocess - but why? The docs indicate otherwise:

      "Your preprocessing function is called after readdir(), but before the loop that calls the wanted() function. It is called with a list of strings (actually file/directory names) and is expected to return a list of strings. The code can be used to sort the file/directory names alphabetically, numerically, or to filter out directory entries based on their name alone."

      That says to me that it filters before looping, which is exactly what I would expect, except it doesn't. Okay, lesson learned.

      1 Peter 4:10
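
      For the name-only use that the quoted documentation describes, a small sketch might look like the following. The scratch directory and the file names (a.txt, b.txt, c.bak) are invented for illustration; 'preprocess' only sorts and filters by name, and 'wanted' keeps its own -f test because the starting directory still reaches it as '.':

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch directory with a few made-up file names.
my $top = tempdir(CLEANUP => 1);
for my $name (qw(b.txt a.txt c.bak)) {
    open my $fh, '>', "$top/$name" or die "open $name: $!";
    close $fh;
}

my @order;
find({
    # Name-only work: drop *.bak entries, then sort alphabetically.
    preprocess => sub { return sort grep { !/\.bak\z/ } @_ },
    # The starting directory arrives as '.', so non-files are skipped here.
    wanted     => sub { push @order, $_ if -f $_ },
}, $top);

print "$_\n" for @order;
```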
Re: File test in grep not excluding current directory
by choroba (Cardinal) on Jul 01, 2014 at 13:29 UTC
    The preprocess function and grep work correctly. The problem is that the current directory is processed once before the preprocess function is called. You can verify this by adding a debug print:
    preprocess => sub { print "preprocessing [@_]"; return grep -f $_ && -M _ < 1, @_ }
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
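
    One possible way to live with this behaviour, sketched under assumptions (a File::Temp scratch tree with invented names new.txt, old.txt and sub/inner.txt), is to do the age filtering in 'preprocess' while letting directories through, and to leave a cheap -f guard in 'wanted'. Letting directories through matters because a plain -f filter in 'preprocess' also removes subdirectories from the list, so find() would never descend into them.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch tree: a recent file, an old file, and a subdirectory
# holding another recent file.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir: $!";
for my $path ("$top/new.txt", "$top/old.txt", "$top/sub/inner.txt") {
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

# Backdate old.txt by two days so that -M reports an age above 1.
my $two_days_ago = time - 2 * 24 * 60 * 60;
utime $two_days_ago, $two_days_ago, "$top/old.txt" or die "utime: $!";

my @recent;
find({
    # Keep directories so find() can still descend; keep other entries
    # only if they were modified less than a day ago.
    preprocess => sub { return grep { -d $_ || -M _ < 1 } @_ },
    # The starting directory arrives here as '.', before any preprocess call,
    # so directories are still skipped at this point.
    wanted     => sub { push @recent, $File::Find::name if -f $_ },
}, $top);

print "$_\n" for sort @recent;
```

    The file tests in 'preprocess' rely on find() having already changed into the directory being read, which is the default; with no_chdir set, the bare names would need a directory prefix first.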
Re: File test in grep not excluding current directory ( use File::Find::Rule qw/ find rule :Age /; )
by Anonymous Monk on Jul 01, 2014 at 15:47 UTC

    ## also loads File::Find::Rule::Age
    use File::Find::Rule qw/ find rule :Age /;
    use Data::Dump qw/ dd /;  ## assumption: dd() comes from Data::Dump; $dir is set elsewhere

    {
        my @files = find( file => age => [ newer => '1D' ], in => [ $dir ] );
        dd( \@files );

        my $counter = 0;
        my $printer = sub {
            $counter++;
            print "$counter $_[2]\n";
            return !!0; ## means discard
        };

        ## look ma, no iterator required :)
        ## nothing returned because $printer discards ...
        @files = rule( file => age => [ newer => '1D' ], exec => $printer )->in( $dir );
        dd( \@files );
    }
    __END__
    ["today/a/hi.txt", "today/b/bye.txt"]
    1 today/a/hi.txt
    2 today/b/bye.txt
    []
Re: File test in grep not excluding current directory
by Anonymous Monk on Jul 01, 2014 at 13:29 UTC

    What do you get after changing $_ to $_[0] in wanted sub reference?

      Oh never mind, for $_ is the *base*name of a file per pod (both $_ & $_[0] are useless as is sans directory change).
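
      For comparison, here is a hedged sketch of the no_chdir option from the File::Find documentation, again using an invented File::Temp scratch directory and file name. With no_chdir set, find() does not change directory, and $_ holds the same value as $File::Find::name instead of the basename:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical scratch directory with a single made-up file.
my $top = tempdir(CLEANUP => 1);
open my $fh, '>', "$top/file1" or die "open: $!";
close $fh;

my @pairs;    # one [ $_, $File::Find::name ] pair per wanted call
find({
    no_chdir => 1,    # stay in the original working directory
    wanted   => sub { push @pairs, [ $_, $File::Find::name ] },
}, $top);

# With no_chdir, each $_ is a path that works from the original directory.
print "$_->[0]\n" for @pairs;
```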

Node Type: perlquestion [id://1091839]
Approved by toolic
Front-paged by perlfan