in reply to Analyzing Files Within a Directory

I am not sure about all of what you want to do. However, when dealing with multiple directories, the module File::Find is often the place to start. This will recurse down from a starting directory and call a subroutine for every file and directory underneath the starting directory. Some example code:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

$|=1; # turns buffering off on STDOUT so
      # error messages to unbuffered STDERR wind up
      # in the right time line order

my @directories_to_search = ('.', 'another dir here');

find( \&process_each_file, @directories_to_search );

sub process_each_file
{
    return unless -f $_;  # only simple files (no dirs)

    # -C, -M and -A report an age in days, measured from script start
    print "$File::Find::name ctime: ", -C _, "\n";
    print "$File::Find::name mtime: ", -M _, "\n";
    print "$File::Find::name atime: ", -A _, "\n";
}
Perl does a stat on the file for each file test operator. stat() returns a big list of information which is cached. Above I used the special filehandle underscore, "_", to access different fields of the cached information. This is not required, but it is faster, since only the cached result is read rather than doing another "expensive" file system operation. However, with only 8 directories, I doubt this will matter at all in terms of performance.
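
To make the "_" trick concrete, here is a minimal sketch outside of File::Find. The helper name file_ages is my own invention for illustration. Only the first file test hits the file system; the three that follow read the cached stat buffer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns the age of a
# plain file in days according to each of its three timestamps.
sub file_ages {
    my ($path) = @_;
    return unless -f $path;      # the one real stat() happens here
    return {
        ctime_days => -C _,      # "_" reuses the cached stat buffer
        mtime_days => -M _,
        atime_days => -A _,
    };
}

# $0 is the running script; the guard skips the demo if it is not a file
if ( my $ages = file_ages($0) ) {
    printf "%s was modified %.4f days before this run started\n",
        $0, $ages->{mtime_days};
}
```

The ages are relative to the time the script started, so a file touched while the script runs can show a small negative value.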

Modify @directories_to_search with either relative paths to where this script executes from or absolute paths as per your OS requirements.

I have no idea about what you mean by "start" and "end" times? Can you clarify that?

Update: I looked again at this and your terminology of "last create time" set off some warning bells. There isn't any such thing. atime is the last time the contents were accessed, i.e. read. mtime is the last time that the file contents were modified. ctime is "change time", not create time. It is updated when the contents change (so whenever mtime changes) or when metadata such as file permissions changes. Whenever anything about a file changes (except its access time), its ctime changes. A Windows NTFS file system does have the idea of a creation or "born on" time. There are some wild quirks about that and I think we will go far astray talking about that now. I highly doubt that you actually mean "creation time". A special API is needed to get this "creation time" on Windows, -C $filename will not do it.

Under almost all circumstances, the parameter of most interest is the mtime: the time that the contents of the file last changed.
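
If you want real timestamps rather than ages in days, stat() returns them as epoch seconds in fields 8 (atime), 9 (mtime) and 10 (ctime). A small sketch follows; the helper names file_times and fmt_time are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: the three epoch timestamps from one stat() call.
sub file_times {
    my ($path) = @_;
    my @st = stat($path) or return;   # empty list means stat failed
    return { atime => $st[8], mtime => $st[9], ctime => $st[10] };
}

# Hypothetical helper: epoch seconds to a readable local time string.
sub fmt_time {
    my ($epoch) = @_;
    return strftime( '%Y-%m-%d %H:%M:%S', localtime($epoch) );
}

if ( my $t = file_times('.') ) {
    print "current directory mtime: ", fmt_time( $t->{mtime} ), "\n";
}
```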

Re^2: Analyzing Files Within a Directory
by huck (Prior) on Mar 15, 2017 at 01:58 UTC

    i tend to use a construct like this

    sub find_temp{
      my $dir=shift;
      my @txts;
      find( sub {
              return unless (-f $File::Find::name);
              push @txts,$File::Find::name;;
            }
          , $dir.'/tmp/');
      return \@txts ;
    } # find temp
    these days, since i want to use the list later. in this case i might change
    push @txts,$File::Find::name;;
    to
    push @txts,[$File::Find::name, -C _, -M _, -A _];
    even if @txts is a bad name for it now. I knew the only thing in those dirs were .txt files and subdirs
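
A sketch of what that "keep the list for later" idea could look like when sorting by modification time. The sub name find_files_with_mtime is made up here, and it stores the mtime as epoch seconds via stat rather than the -M age.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect [ name, mtime-epoch ] pairs for every
# plain file underneath $dir.
sub find_files_with_mtime {
    my ($dir) = @_;
    my @found;
    find( sub {
        return unless -f $_;    # $_ is the name inside the current dir
        push @found, [ $File::Find::name, (stat _)[9] ];  # reuse cached stat
    }, $dir );
    return \@found;
}

# Demo only runs when a directory is given: perl script.pl some_dir
if (@ARGV) {
    my $files = find_files_with_mtime( $ARGV[0] );
    for my $rec ( sort { $a->[1] <=> $b->[1] } @$files ) {   # oldest first
        printf "%s  %s\n", scalar localtime( $rec->[1] ), $rec->[0];
    }
}
```

Returning an array reference keeps the collecting and the reporting separate, so the same list can be sorted or filtered in several ways afterwards.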

Re^2: Analyzing Files Within a Directory
by xc63 (Initiate) on Mar 19, 2017 at 14:52 UTC
    Thanks so much for that info, sorry it took me a little while to update this thread. I think when it comes to start and end times, I'm looking for when the files are last read and/or written. I am also attempting to parse out the results by field, e.g., the batch ID/job ID. I'm trying to make modifications to figure out how to do the aforementioned. Will File::Find make it easier to find each with also specifying each individual file's batch ID/job ID numbers?
      "last read and/or written." that sounds like you need atime, access time. You could have say a file that has not been changed for a year, but was accessed just a second ago for reading. That might be true of a "work horse" program that is often used, but seldom modified.

      For the second part, "parse out the results by field, e.g., the batch ID/job ID.". That sounds like you need some sort of regex to do file name matching? You might want to consider File::Find::Rule. In some more complicated scenarios, this can make the program logic easier to understand and implement. My requirements are usually straight-forward enough that I don't need it, but you should at least be aware of this option.
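
Since I don't know what the file names actually look like, here is only a sketch of the regex idea. The name layout "batch_<digits>_job_<digits>.txt" and the helper parse_ids are pure assumptions for illustration; the pattern has to be adapted to the real names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: names look like "batch_<digits>_job_<digits>.txt".
# This layout is invented for the example, not taken from the question.
sub parse_ids {
    my ($name) = @_;
    return unless $name =~ /\bbatch_(\d+)_job_(\d+)\.txt\z/;
    return { batch_id => $1, job_id => $2 };
}

for my $name ( 'batch_1234_job_56.txt', 'notes.txt' ) {
    if ( my $ids = parse_ids($name) ) {
        print "$name: batch $ids->{batch_id}, job $ids->{job_id}\n";
    }
    else {
        print "$name: no IDs found\n";
    }
}
```

Inside a File::Find callback, the same sub could be called on $_ for each plain file.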

      Update: A few more comments:

      "Will File::Find make it easier to find each with also specifying each individual file's batch ID/job ID numbers?" File::Find solves the problem of writing the code to recursively descend through the directory structure. This is well tested code that works. You can then focus on the job of deciding what to do with each file. In O/S file system lingo, a directory is actually just another type of "file". The -f test will tell you whether a name is a simple plain file, as opposed to a directory or some kind of link. Note that the directory entries '.' and '..' exist in every directory, but are normally skipped.

      Also note that huck made some good suggestions, although I am not sure if your level of experience allows you to completely understand his code. There are some "above beginner" aspects to it. Nothing derogatory is intended.

      I suggest you start with my code as a prototype and see how you get on with that. By all means ask if you have questions.

        that sounds like you need atime, access time.

        Just a note: Updating atime may be "expensive" in some ways (e.g. the computer may need to spin up a laptop's hard disk just to update the atime of a file that is already in the buffer cache), so there are mount options to delay or completely prevent updating the atime. For Linux, search for "noatime", "strictatime", "relatime" in the mount manpage for details.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)