propellerhat has asked for the wisdom of the Perl Monks concerning the following question:
After much searching and reading, the articles I have found regarding File::Find do nothing other than list file names, though they begin by saying things such as "do something with file".
I need help or a tutorial which shows me how to:
(1) run an IF check on files located by File::Find
(2) read from and write to a file which passes the IF check
I suppose File::Find returns whatever I need to open a filehandle, but I have not located the File::Find specification.
Re: file modifications using file::find
by kcott (Archbishop) on Jul 25, 2021 at 10:06 UTC
G'day propellerhat,
Welcome to the Monastery.
I created some directories and populated them with files having varying contents and permissions:
$ for i in a b c; do cd $i; echo "DIR: `pwd`"; ls -l; cd ..; done
DIR: /home/ken/tmp/pm_11135362/a
total 1
-rw-r--r-- 1 ken None 0 Jul 25 16:44 empty
---------- 1 ken None 18 Jul 25 16:45 no_access
DIR: /home/ken/tmp/pm_11135362/b
total 1
-r--r--r-- 1 ken None 20 Jul 25 16:49 read_only
DIR: /home/ken/tmp/pm_11135362/c
total 1
-rw-r--r-- 1 ken None 7 Jul 25 16:51 read_write
I then wrote the following script which does various checks.
The else blocks (with "OK to READ/WRITE") are where you'd call your reading/writing routines.
#!/usr/bin/env perl

use strict;
use warnings;

use Cwd;
use File::Find;

my $cwd = getcwd();
my @dirs = map "$cwd/$_", qw{a b c};

print "--- READING ---\n";
find(\&wanted_to_read, @dirs);

print "--- WRITING ---\n";
find(\&wanted_to_write, @dirs);

sub wanted_to_read {
    if (! -f $File::Find::name) {
        print "$File::Find::name is not a normal file.\n";
    }
    elsif (-z _) {
        print "$File::Find::name is zero-length.\n";
    }
    elsif (! -r _) {
        print "$File::Find::name is not readable.\n";
    }
    else {
        print "OK to READ: $File::Find::name\n";
    }

    return;
}

sub wanted_to_write {
    if (! -f $File::Find::name) {
        print "$File::Find::name is not a normal file.\n";
    }
    elsif (! -r _) {
        print "$File::Find::name is not readable.\n";
    }
    elsif (! -w _) {
        print "$File::Find::name is not writable.\n";
    }
    else {
        print "OK to WRITE: $File::Find::name\n";
    }

    return;
}
A sample run outputs:
ken@titan ~/tmp/pm_11135362
$ ./pm_11135362_file_find_example.pl
--- READING ---
/home/ken/tmp/pm_11135362/a is not a normal file.
/home/ken/tmp/pm_11135362/a/empty is zero-length.
/home/ken/tmp/pm_11135362/a/no_access is not readable.
/home/ken/tmp/pm_11135362/b is not a normal file.
OK to READ: /home/ken/tmp/pm_11135362/b/read_only
/home/ken/tmp/pm_11135362/c is not a normal file.
OK to READ: /home/ken/tmp/pm_11135362/c/read_write
--- WRITING ---
/home/ken/tmp/pm_11135362/a is not a normal file.
OK to WRITE: /home/ken/tmp/pm_11135362/a/empty
/home/ken/tmp/pm_11135362/a/no_access is not readable.
/home/ken/tmp/pm_11135362/b is not a normal file.
/home/ken/tmp/pm_11135362/b/read_only is not writable.
/home/ken/tmp/pm_11135362/c is not a normal file.
OK to WRITE: /home/ken/tmp/pm_11135362/c/read_write
I don't know what your reading/writing requirements are.
See the open function, in the first instance, if you're unsure about that.
Feel free to ask further questions about that if need be.
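As a rough starting point for those else blocks, here is a minimal sketch of reading and writing with the three-argument form of open. The subs count_lines() and append_line() are hypothetical names invented for this illustration, and the scratch file from File::Temp only stands in for a file that passed the checks.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw{tempfile};

# Hypothetical reader: returns the number of lines in a file.
sub count_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't read '$path': $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    close $fh;
    return $count;
}

# Hypothetical writer: appends one line to a file.
sub append_line {
    my ($path, $text) = @_;
    open my $fh, '>>', $path or die "Can't append to '$path': $!";
    print {$fh} "$text\n";
    close $fh or die "Can't close '$path': $!";
    return;
}

# Demonstrate on a scratch file rather than on real data.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nsecond\n";
close $tmp_fh;

append_line($tmp_path, 'third');
print "Lines now: ", count_lines($tmp_path), "\n";
```

Inside a wanted routine you would pass $File::Find::name (or a name saved from it) instead of $tmp_path.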
I also don't know what you mean by "IF check".
It's not mentioned in the File::Find documentation.
By itself, "IF" has a number of potentially valid interpretations in the context of your question
(for instance, in "What does IF stand for?");
and you give no indication of what you intend to check.
I've used a number of "file tests" which is possibly the sort of thing you want.
[See "How do I post a question effectively?" for information on how you can help us to help you.]
Be very careful with specifying directories when using File::Find.
I used Cwd for my example code, but that would have various problems in a production environment.
The FindBin module may be useful if your target directories are always located
relative to your script.
Better options are to get the directories from a known source; e.g. a database, config file, or the like.
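To illustrate the FindBin suggestion, here is a minimal sketch. It assumes, purely for illustration, that directories named a, b and c sit beside the script; directories that do not exist are skipped.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use FindBin qw{$Bin};

# Assumption: the target directories 'a', 'b' and 'c' live beside this script.
# $Bin is the directory containing the script, whatever the current directory is.
my @dirs = grep { -d $_ } map { File::Spec->catdir($Bin, $_) } qw{a b c};

my @found;
if (@dirs) {
    # Collect plain files only; $_ is the basename because find() has
    # changed into the file's directory (the default behaviour).
    find(sub { push @found, $File::Find::name if -f $_ }, @dirs);
}
else {
    print "No target directories found beside the script.\n";
}

print "$_\n" for @found;
```

Unlike getcwd(), this gives the same search roots no matter where the script is started from.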
Ken, I like your code! Just a few comments:
You snuck in tests like "-z _". This is completely correct usage. As extra explanation for the OP: a file test is a fairly "expensive" file system operation (see File::stat). When doing multiple tests on the same file, use the file name for the first test; this makes the file system return a structure holding all kinds of information about the file. For the 2nd, 3rd, etc. tests, use "_" instead of the file name, which lets Perl answer from the cached result of that last stat request, so these subsequent tests go a lot faster.
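A minimal sketch of that pattern, outside of File::Find, might look like the following. The sub describe() is a hypothetical helper made up for this example, and the scratch file only exists so the code runs anywhere.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw{tempfile};

# Scratch file so the sketch runs anywhere.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "some content\n";
close $fh;

# Hypothetical helper: one real stat via -f, then cached answers via '_'.
sub describe {
    my ($file) = @_;
    return 'not a plain file' unless -f $file;   # asks the file system
    return 'empty'            if -z _;           # reuses the cached stat
    return 'unreadable'       unless -r _;       # reuses the cached stat
    return sprintf 'readable, %d bytes', -s _;   # still the same stat
}

print describe($path), "\n";
```

Only the first test names the file; if another file test or stat on a different file runs in between, "_" refers to that later one instead.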
About cwd: it is not clear what the OP intends to do with files that pass "whatever the file test(s) are". I strive to do minimal processing within the File::Find wanted routine. The reason is that File::Find will chdir down the directory structure as it goes about its business. If some complicated routine called from within the wanted routine (say, "process a .pdf file") blows up, you will be left in some random place in the file structure, far removed from whatever directory the script started in. That can complicate error recovery procedures. So usually I just generate a "to-do" list within the wanted routine and then do the actual complicated work once all the files have been found. Now of course there are a lot of "yeah, buts" to that general approach. Mileage certainly does vary! I am just saying that in my experience, keeping the wanted routine simple is a good idea.
Update: The OP wrote: After much searching and reading, the articles I have found regarding File::Find do nothing other than list file names, though they begin by saying things such as "do something with file". In general, I would make an array, my @found; have the wanted routine push each applicable $File::Find::name onto that array, and then process those files once File::Find has finished its job. Keeping the wanted routine simple and restricting its job to just "finding files" to operate upon can save a lot of grief.
G'day Marshall,
"Ken, I like your code!"
Thanks. I appreciate the compliment.
'You snuck in the tests like "-z _".'
Unfortunately, the OP gave very little information about Perl experience or, indeed, File::Find usage requirements.
As this was a first post, I didn't make a big deal about it;
although, I did provide a link "for information on how you can help us to help you".
My main aim was to provide the requested "help or a tutorial".
I did consider including an explanation for the special filehandle, '_';
however, as I didn't know if these tests would be used, I chose to add a link to that information.
I did use the link text "file tests", in case "-X" was too cryptic. :-)
"About cwd"
The last paragraph of my post did include a warning about specifying directories;
an explanation that I'd only used Cwd for my example code;
and, suggested a number of better alternatives.
Furthermore, I only used the getcwd() function to generate the required @directories_to_search for find();
there was no explicit changing of directories in my script.
Of course, there is the implicit changing of directories as part of File::Find's default behaviour.
You can change that: see "File::Find - %options - no_chdir".
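As a small illustration of that option, the sketch below builds a throwaway directory tree (the names sub and file.txt are made up for this example) and counts how often the working directory differs from the starting one while find() runs with no_chdir set.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Cwd qw{getcwd};
use File::Find;
use File::Spec;
use File::Temp qw{tempdir};

# Build a scratch tree: <tmp>/sub/file.txt
my $top = tempdir(CLEANUP => 1);
my $sub = File::Spec->catdir($top, 'sub');
mkdir $sub or die "Can't mkdir '$sub': $!";
open my $fh, '>', File::Spec->catfile($sub, 'file.txt')
    or die "Can't create file: $!";
close $fh;

my $start       = getcwd();
my $cwd_changed = 0;
my @seen;

find({
    no_chdir => 1,
    wanted   => sub {
        # With no_chdir, $_ holds the same path as $File::Find::name.
        $cwd_changed++ if getcwd() ne $start;
        push @seen, $_ if -f $_;
    },
}, $top);

print "Directory changes observed: $cwd_changed\n";
print "Found: $_\n" for @seen;
```

Without no_chdir, $_ would be just the file's basename and the working directory would follow the traversal.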
Your comments regarding using an array are sound.
You may actually want to use the list more than once.
I assume you know how to do this, but for the OP or anybody else, here's a very basic (partial) code example:
...
find(\&wanted, @dirs);
do_something_with(get_found_files());
check_something_done_with(get_found_files());
...

{
    my @found_files;

    sub get_found_files { return @found_files }

    sub wanted {
        ...
        push @found_files, $File::Find::name;
        ...
    }
}
Note that the anonymous block makes @found_files (lexically) private:
only get_found_files() and wanted() have access to it.
You can, of course, modify the list returned, but the original will stay intact.
For anyone unfamiliar with this concept,
see "perlsub: Private Variables via my()"
for a more in-depth discussion of this topic.
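For anyone who wants to see the "original will stay intact" point in action, here is a minimal runnable sketch. The sub add_found_file() is a hypothetical stand-in for wanted(), so that the example does not need a directory tree.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

{
    my @found_files;    # visible only inside this block

    # Hypothetical stand-in for wanted(): records file names.
    sub add_found_file  { push @found_files, @_; return }

    # Returns a copy of the list, not the private array itself.
    sub get_found_files { return @found_files }
}

add_found_file('a/empty', 'c/read_write');

my @copy = get_found_files();
push @copy, 'not/really/found';    # changes the copy only

my @original = get_found_files();
printf "copy has %d entries, original has %d\n",
    scalar @copy, scalar @original;
```

This prints "copy has 3 entries, original has 2".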
> I strive to do minimal processing within the File::Find wanted routine
So do I! I find I grow fewer grey hairs that way. ;)
To illustrate, here's a simple example of using File::Find to find all .txt files under the current working directory.
use strict;
use warnings;
use Cwd;
use File::Find;

# Return a list of the absolute path of all plain .txt files under $dir
sub FindTextFiles
{
    my $dir = shift;
    my @files;
    # Note: -f = plain file (perldoc -f -X for doco of all file tests)
    find( { no_chdir => 1,
            wanted   => sub { -f && /\.txt$/ and push @files, $File::Find::name }
          },
          $dir
    );
    return @files;
}

my $dir = getcwd();
my @txtfiles = FindTextFiles($dir);
print "Found ", scalar(@txtfiles), " text files under dir '$dir'...\n";
for my $file (@txtfiles) {
    print "file='$file'\n";
    # could add code to modify the found files here ...
}
Example output of running this program:
Found 5 text files under dir 'C:/pm/file-find'...
file='C:/pm/file-find/example.txt'
file='C:/pm/file-find/fred/f1/zz.txt'
file='C:/pm/file-find/fred/f2/hello.txt'
file='C:/pm/file-find/fred/f2/f2a/2a.txt'
file='C:/pm/file-find/fred/f2/f2a/hello.txt'
Once I've built the list of files, I sometimes set about changing them in-place -- which is surprisingly tricky to do robustly,
as described at Re-runnably editing a file in place (see also CPAN File::Replace by haukex, which nicely solves this problem).
A practice exercise for the OP: extend the test program above to change all occurrences of Peking to Beijing
in the .txt files (which I sometimes torture job applicants with :).
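For readers who just want a starting point, here is one possible sketch of the replacement step: write the changed content to a temporary file in the same directory, then rename it over the original. This is a simplified approach, not the robust solution discussed in the node mentioned above or provided by File::Replace; the sub replace_in_file() is a hypothetical name, and the scratch file stands in for one entry of @txtfiles.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Basename qw{dirname};
use File::Temp qw{tempfile tempdir};

# Hypothetical helper: rewrite $path with every 'Peking' changed to 'Beijing'.
# Returns the number of replacements made.
sub replace_in_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Can't read '$path': $!";

    # Write the new content to a temp file in the same directory,
    # so the final rename stays on one file system.
    my ($out, $tmp) = tempfile(DIR => dirname($path));

    my $changes = 0;
    while (my $line = <$in>) {
        $changes += ($line =~ s/Peking/Beijing/g);
        print {$out} $line;
    }
    close $in;
    close $out or die "Can't close '$tmp': $!";

    if ($changes) {
        rename $tmp, $path or die "Can't rename '$tmp' to '$path': $!";
    }
    else {
        unlink $tmp;    # nothing changed: leave the original untouched
    }
    return $changes;
}

# Try it on a scratch file.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/example.txt";
open my $fh, '>', $file or die "Can't create '$file': $!";
print {$fh} "Peking duck\nFrom Peking to Peking\nNo match here\n";
close $fh;

my $count = replace_in_file($file);
print "Replacements: $count\n";
```

Note that this sketch does not preserve the original file's permissions or ownership, and it makes no attempt to be safely re-runnable after a crash; those are among the things that make the problem tricky.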
Re: file modifications using file::find
by eyepopslikeamosquito (Archbishop) on Jul 25, 2021 at 12:06 UTC
Welcome to the monastery propellerhat!
It would be great if you could provide us with a bit more context about yourself and your problem, including why you need to solve it.
- Are you an experienced Perl programmer or new to Perl?
- Which platform/s do you need your script to run on? Unix? (if so, which flavour?). Do you require that it runs on Windows too?
That will allow us to provide you with more helpful answers on how to get the most out of Perl's excellent File::Find module.
As indicated here, Perl's File::Find module has many advantages compared to Unix shell and its find command.
I also know from personal experience that File::Find is portable and works well under Windows too - though there are
some pitfalls you need to be aware of, due to the underlying differences between Unix and Windows file systems.
Re: file modifications using file::find
by Anonymous Monk on Jul 25, 2021 at 03:32 UTC
After much searching and reading, I have not located the File::Find specification.
... did you go to http://perldoc.perl.org and type File::Find into the search box, or type perldoc File::Find at the command line?
Re: file modifications using file::find
by Anonymous Monk on Jul 25, 2021 at 02:14 UTC