propellerhat has asked for the wisdom of the Perl Monks concerning the following question:

After much searching and reading, the articles I have found regarding File::Find do nothing other than list file names, though they begin by saying things such as "do something with file". I need help or a tutorial which shows me how to: (1) run an IF check on files located by File::Find; (2) read from and write to a file which passes the IF check. I suppose File::Find returns whatever I need to open a filehandle, but I have not located the File::Find specification.

Replies are listed 'Best First'.
Re: file modifications using file::find
by kcott (Archbishop) on Jul 25, 2021 at 10:06 UTC

    G'day propellerhat,

    Welcome to the Monastery.

    I created some directories and populated them with files having varying contents and permissions:

    $ for i in a b c; do cd $i; echo "DIR: `pwd`"; ls -l; cd ..; done
    DIR: /home/ken/tmp/pm_11135362/a
    total 1
    -rw-r--r-- 1 ken None  0 Jul 25 16:44 empty
    ---------- 1 ken None 18 Jul 25 16:45 no_access
    DIR: /home/ken/tmp/pm_11135362/b
    total 1
    -r--r--r-- 1 ken None 20 Jul 25 16:49 read_only
    DIR: /home/ken/tmp/pm_11135362/c
    total 1
    -rw-r--r-- 1 ken None  7 Jul 25 16:51 read_write

    I then wrote the following script which does various checks. The else blocks (with "OK to READ/WRITE") are where you'd call your reading/writing routines.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Cwd;
    use File::Find;

    my $cwd = getcwd();
    my @dirs = map "$cwd/$_", qw{a b c};

    print "--- READING ---\n";
    find(\&wanted_to_read, @dirs);

    print "--- WRITING ---\n";
    find(\&wanted_to_write, @dirs);

    sub wanted_to_read {
        if (! -f $File::Find::name) {
            print "$File::Find::name is not a normal file.\n";
        }
        elsif (-z _) {
            print "$File::Find::name is zero-length.\n";
        }
        elsif (! -r _) {
            print "$File::Find::name is not readable.\n";
        }
        else {
            print "OK to READ: $File::Find::name\n";
        }

        return;
    }

    sub wanted_to_write {
        if (! -f $File::Find::name) {
            print "$File::Find::name is not a normal file.\n";
        }
        elsif (! -r _) {
            print "$File::Find::name is not readable.\n";
        }
        elsif (! -w _) {
            print "$File::Find::name is not writable.\n";
        }
        else {
            print "OK to WRITE: $File::Find::name\n";
        }

        return;
    }

    A sample run outputs:

    ken@titan ~/tmp/pm_11135362
    $ ./pm_11135362_file_find_example.pl
    --- READING ---
    /home/ken/tmp/pm_11135362/a is not a normal file.
    /home/ken/tmp/pm_11135362/a/empty is zero-length.
    /home/ken/tmp/pm_11135362/a/no_access is not readable.
    /home/ken/tmp/pm_11135362/b is not a normal file.
    OK to READ: /home/ken/tmp/pm_11135362/b/read_only
    /home/ken/tmp/pm_11135362/c is not a normal file.
    OK to READ: /home/ken/tmp/pm_11135362/c/read_write
    --- WRITING ---
    /home/ken/tmp/pm_11135362/a is not a normal file.
    OK to WRITE: /home/ken/tmp/pm_11135362/a/empty
    /home/ken/tmp/pm_11135362/a/no_access is not readable.
    /home/ken/tmp/pm_11135362/b is not a normal file.
    /home/ken/tmp/pm_11135362/b/read_only is not writable.
    /home/ken/tmp/pm_11135362/c is not a normal file.
    OK to WRITE: /home/ken/tmp/pm_11135362/c/read_write

    I don't know what your reading/writing requirements are. See the open function, in the first instance, if you're unsure about that. Feel free to ask further questions about that if need be.
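    For anyone unsure how a wanted routine and open fit together, here is a minimal sketch. The temporary directory, the file sample.txt, and the routine name read_then_append are all made up for illustration; they are not part of the script above. It relies on the search directory being an absolute path, so that $File::Find::name can be passed straight to open even though find() changes directory as it goes.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use File::Find;
use File::Temp qw{tempdir};

# Hypothetical test data: a throwaway directory holding one small file.
my $dir = tempdir(CLEANUP => 1);
open my $setup_fh, '>', "$dir/sample.txt" or die "Can't create sample: $!";
print {$setup_fh} "first line\n";
close $setup_fh or die "Can't close sample: $!";

my @lines_seen;

find(\&read_then_append, $dir);

sub read_then_append {
    # Same style of checks as above; '_' reuses the stat done by -f.
    return unless -f $File::Find::name && -r _ && -w _;

    # Read: $File::Find::name is the full path, which open() accepts.
    open my $in_fh, '<', $File::Find::name
        or die "Can't read $File::Find::name: $!";
    push @lines_seen, <$in_fh>;
    close $in_fh;

    # Write: '>>' appends; '>' would truncate the file first.
    open my $out_fh, '>>', $File::Find::name
        or die "Can't append to $File::Find::name: $!";
    print {$out_fh} "appended line\n";
    close $out_fh or die "Can't close $File::Find::name: $!";

    return;
}

print "Read: $_" for @lines_seen;
```

    The three-argument form of open with a lexical filehandle is used throughout; whether you want '>', '>>' or '+<' depends on your actual requirements.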

    I also don't know what you mean by "IF check". It's not mentioned in the File::Find documentation. By itself, "IF" has a number of potentially valid interpretations in the context of your question (for instance, in "What does IF stand for?"); and you give no indication of what you intend to check. I've used a number of "file tests" which is possibly the sort of thing you want. [See "How do I post a question effectively?" for information on how you can help us to help you.]

    Be very careful with specifying directories when using File::Find. I used Cwd for my example code, but that would have various problems in a production environment. The FindBin module may be useful if your target directories are always located relative to your script. Better options are to get the directories from a known source; e.g. a database, config file, or the like.

    — Ken

      Ken, I like your code! Just a few comments:

      You snuck in the tests like "-z _". This is completely correct usage. As extra explanation for the OP: a file test is a fairly "expensive" file system operation (see File::stat). When doing multiple tests on the same file, give the file name to the first test only. That makes Perl issue a stat request, which returns a structure holding all kinds of information about the file. For the 2nd, 3rd, etc. tests, use "_" instead of the file name; Perl then answers from the information cached by that last stat request, so these subsequent tests go a lot faster.
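      To make the "_" idiom concrete, here is a small sketch. The temporary file and the @facts array are invented for the example. Only the first test names the file; the later tests reuse the cached stat information.

```perl
use strict;
use warnings;

use File::Temp qw{tempfile};

# Hypothetical test file with known contents.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello\n";
close $fh or die "Can't close $path: $!";

my @facts;

if (-e $path) {                           # real stat() on the file name
    push @facts, 'plain file' if -f _;    # answered from the cached stat
    push @facts, 'non-empty'  if -s _;    # -s gives the size; true if > 0
    push @facts, 'readable'   if -r _;
}

print "$path: @facts\n";
```

      Note that "_" always refers to the most recent stat, so any other file test or stat call in between changes what it means.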

      About cwd: it is not clear what the OP intends to do with files that pass "whatever the file test(s) are". I strive to do minimal processing within the File::Find wanted routine. The reason is that File::Find will chdir down the directory structure as it goes about its business. If something called from within the wanted routine "blows up" (maybe some complicated "process a .pdf file" routine), you will be left in some random place in the file structure, far removed from whatever directory the script started in. That can complicate error recovery procedures. So usually I just generate a "to-do" list within the wanted routine and then do the actual complicated work once all the files have been found. Now of course there are a lot of "yeah, buts" to that general approach. Mileage certainly does vary! I am just saying that in my experience, keeping the "wanted routine" simple is a good idea.

      Update: The OP wrote: "After much searching and reading, the articles I have found regarding File::Find do nothing other than list file names, though they begin by saying things such as "do something with file"." In general, I would make an array, my @found; have the "wanted" routine push each applicable $File::Find::name onto that array; and then process those files once File::Find has finished its job. Keeping the "wanted" routine simple and restricting its job to just "finding files" to operate upon can save a lot of grief.

        G'day Marshall,

        "Ken, I like your code!"

        Thanks. I appreciate the compliment.

        'You snuck in the tests like "-z _".'

        Unfortunately, the OP gave very little information about Perl experience or, indeed, File::Find usage requirements. As this was a first post, I didn't make a big deal about it; although, I did provide a link "for information on how you can help us to help you".

        My main aim was to provide the requested "help or a tutorial". I did consider including an explanation for the special filehandle, '_'; however, as I didn't know if these tests would be used, I chose to add a link to that information. I did use the link text "file tests", in case "-X" was too cryptic. :-)

        "About cwd"

        The last paragraph of my post did include a warning about specifying directories; an explanation that I'd only used Cwd for my example code; and, suggested a number of better alternatives. Furthermore, I only used the getcwd() function to generate the required @directories_to_search for find(); there was no explicit changing of directories in my script.

        Of course, there is the implicit changing of directories as part of File::Find's default behaviour. You can change that: see "File::Find - %options - no_chdir".
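        As a rough illustration of the difference, the sketch below runs find() twice over a made-up temporary tree (the directory sub and the file note.txt are hypothetical). By default $_ holds the bare file name, because find() has changed into the containing directory; with no_chdir it holds the same full path as $File::Find::name and the working directory is left alone.

```perl
use strict;
use warnings;

use Cwd;
use File::Find;
use File::Temp qw{tempdir};

# Hypothetical tree: <tempdir>/sub/note.txt
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/sub" or die "Can't mkdir $dir/sub: $!";
open my $fh, '>', "$dir/sub/note.txt" or die "Can't create file: $!";
close $fh;

my $start = getcwd();
my (@default_seen, @no_chdir_seen);
my $cwd_changed = 0;

# Default: find() chdir()s into each directory; $_ is the bare name.
find(sub { push @default_seen, $_ if -f $_ }, $dir);

# no_chdir: cwd stays put; $_ is the same path as $File::Find::name.
find(
    {
        no_chdir => 1,
        wanted   => sub {
            $cwd_changed = 1 if getcwd() ne $start;
            push @no_chdir_seen, $_ if -f $_;
        },
    },
    $dir
);

print "default:  @default_seen\n";
print "no_chdir: @no_chdir_seen\n";
print "cwd changed under no_chdir? ", ($cwd_changed ? 'yes' : 'no'), "\n";
```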

        Your comments regarding using an array are sound. You may actually want to use the list more than once. I assume you know how to do this, but for the OP or anybody else, here's a very basic (partial) code example:

        ...
        find(\&wanted, @dirs);
        do_something_with(get_found_files());
        check_something_done_with(get_found_files());
        ...

        {
            my @found_files;

            sub get_found_files { return @found_files }

            sub wanted {
                ...
                push @found_files, $File::Find::name;
                ...
            }
        }

        Note that the anonymous block makes @found_files (lexically) private: only get_found_files() and wanted() have access to it. You can, of course, modify the list returned, but the original will stay intact. For anyone unfamiliar with this concept, see "perlsub: Private Variables via my()" for a more in-depth discussion of this topic.

        — Ken

        > I strive to do minimal processing within the File::find wanted routine
        So do I! I find I grow fewer grey hairs that way. ;) To illustrate, here's a simple example of using File::Find to find all .txt files under the current working directory.
        use strict;
        use warnings;
        use Cwd;
        use File::Find;

        # Return a list of the absolute path of all plain .txt files under $dir
        sub FindTextFiles {
            my $dir = shift;
            my @files;
            # Note: -f = plain file (perldoc -f -X for doco of all file tests)
            find(
                {
                    no_chdir => 1,
                    wanted   => sub { -f && /\.txt$/ and push @files, $File::Find::name }
                },
                $dir
            );
            return @files;
        }

        my $dir = getcwd();
        my @txtfiles = FindTextFiles($dir);
        print "Found ", scalar(@txtfiles), " text files under dir '$dir'...\n";
        for my $file (@txtfiles) {
            print "file='$file'\n";
            # could add code to modify the found files here ...
        }

        Example output of running this program:

        Found 5 text files under dir 'C:/pm/file-find'...
        file='C:/pm/file-find/example.txt'
        file='C:/pm/file-find/fred/f1/zz.txt'
        file='C:/pm/file-find/fred/f2/hello.txt'
        file='C:/pm/file-find/fred/f2/f2a/2a.txt'
        file='C:/pm/file-find/fred/f2/f2a/hello.txt'

        Once I've built the list of files, I sometimes set about changing them in-place -- which is surprisingly tricky to do robustly, as described at Re-runnably editing a file in place (see also CPAN File::Replace by haukex, which nicely solves this problem).
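        To give a flavour of the basic idea, here is a deliberately simplified sketch of the common "write a temporary file, then rename it over the original" approach. It is not the method from the node linked above, nor how File::Replace works internally, and the helper name edit_file_via_temp and the demo file are invented for this example. It ignores things a robust version must handle, such as preserving permissions, symlinks, and concurrent writers.

```perl
use strict;
use warnings;

use File::Basename qw{dirname};
use File::Temp qw{tempdir tempfile};

# Hypothetical helper: rewrite $path line by line via $edit, a callback
# that modifies $_.  The original is replaced only after the new
# contents have been written out in full.
sub edit_file_via_temp {
    my ($path, $edit) = @_;

    open my $in_fh, '<', $path or die "Can't read $path: $!";

    # Create the temp file in the same directory as the original so
    # that rename() stays on one filesystem.
    my ($out_fh, $tmp_path) = tempfile(DIR => dirname($path));

    while (my $line = <$in_fh>) {
        $edit->() for $line;    # aliases $_ to $line for the callback
        print {$out_fh} $line;
    }

    close $in_fh;
    close $out_fh or die "Can't close $tmp_path: $!";

    rename $tmp_path, $path
        or die "Can't rename $tmp_path to $path: $!";

    return;
}

# Usage on a throwaway file (hypothetical contents).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";
open my $fh, '>', $file or die "Can't create $file: $!";
print {$fh} "colour one\ncolour two\n";
close $fh or die "Can't close $file: $!";

edit_file_via_temp($file, sub { s/colour/color/g });
```

        If the script dies part-way through, the original file is still intact; at worst a stray temporary file is left behind in the same directory.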

        A practice exercise for the OP: extend the test program above to change all occurrences of Peking to Beijing in the .txt files (which I sometimes torture job applicants with :).

Re: file modifications using file::find (File::Find References)
by eyepopslikeamosquito (Archbishop) on Jul 25, 2021 at 12:06 UTC
Re: file modifications using file::find
by Anonymous Monk on Jul 25, 2021 at 03:32 UTC
    After much searching and reading, I have not located the File::Find specification.

    ... did you go to http://perldoc.perl.org and type File::Find into the search box, or type perldoc File::Find at the command line?

Re: file modifications using file::find
by Anonymous Monk on Jul 25, 2021 at 02:14 UTC