Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

Could I seek some help on the following code please?

I'm trying to stat the directories that match the pattern, but the size keeps coming back as zero. Any help would be appreciated.

    find(\&all, @allpaths);

    #Open a file called Paths.bak
    open (LIST, ">$tempfilepath/Paths.bak") or die "$! error trying to overwrite";

    #Foreach line (path found) $p in the array @allpathlisting
    foreach my $p (@allpathlisting) {
        print LIST "$p\n";
        foreach my $s (@sfiles) {
            print LIST " $s\n";
        }
    }
    close LIST;

    copy $tempallpathsoldfile, $tempallpathsnewfile or die "$! error trying to copyfile";

    #Sub routine for File::Find. on all locations
    sub all {
        #Unless directory skip.
        return unless -d;

        # Don't recurse past folders matching the pattern.
        $File::Find::prune = 1 if /[IPDLMY]\d{8}$/;

        #Replace / with \
        (my $fn = $File::Find::name) =~ tr#/#\\#;

        #Replace gpfs_data with nas\rdds
        $fn =~ s/gpfs_data/nas\\rdds/;

        #Push into the @allpathlisting array any paths matching the pattern.
        push @allpathlisting, $fn if /[IPDLMY]\d{8}$/;

        # How big is it?
        my $fsize = (stat($fn))[7];
        push @sfiles,$fsize;
        print "$fsize\n";

        #print results to the screen, comment this out in production.
        print "$fn\n" if /[IPDLMY]\d{8}$/;
    }

Re: File::Find stat question
by roboticus (Chancellor) on Nov 24, 2014 at 00:48 UTC

    Try doing the stat *before* you edit the path name into something that *nix doesn't understand....

    (Alternatively, do the stat on $File::Find::name instead of $fn.)

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Alternatively, do the stat on $File::Find::name

      Actually, you want to do the stat on $_. In typical usage, File::Find has chdir'd into "sub/dir" and found "file.foo" and thus has set $_ to "file.foo" and set $File::Find::name to "sub/dir/file.foo". So doing stat on $File::Find::name will usually fail (once you've gone at least one subdirectory deep).

      - tye        
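A minimal sketch of the difference described above, assuming a relative start path and File::Find's default chdir behaviour. The scratch tree and the `%size` hash are made up for the illustration. When the start path is absolute, or when the `no_chdir` option is set, `$File::Find::name` can be passed to stat directly.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use Cwd qw(getcwd);

# Hypothetical scratch tree: <tmp>/top/sub/dir/file.foo (1024 bytes)
my $root = tempdir(CLEANUP => 1);
make_path("$root/top/sub/dir");
open my $fh, '>', "$root/top/sub/dir/file.foo" or die "open: $!";
print {$fh} 'x' x 1024;
close $fh or die "close: $!";

my $start_dir = getcwd();
chdir $root or die "chdir: $!";

my %size;
find(sub {
    return unless -f $_;
    # $_ is the bare name "file.foo"; find() has already chdir'd into
    # its directory, so this stat resolves.
    $size{by_underscore} = (stat($_))[7];
    # $File::Find::name is "top/sub/dir/file.foo", which does not exist
    # relative to the directory we are now in, so this stat fails.
    $size{by_name} = (stat($File::Find::name))[7];
}, 'top');    # relative start path

chdir $start_dir or die "chdir: $!";

printf "stat(\$_):                %s\n", $size{by_underscore} // 'undef';
printf "stat(\$File::Find::name): %s\n", $size{by_name}       // 'undef';
```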

        Thanks for your advice, however it still won't work I'm getting 0 printed for

            # How big is it?
            my $fsize = (stat($_))[7];
            push @sfiles, $fsize;
            print "$fsize\n";

        Also instead of a 1MB file I'm getting a 700! MB file with this bit... so something isn't right at all :-(

            foreach my $p (@allpathlisting) #Foreach line (path found) $p in the array @allpathlisting
            {
                #Write each line in the array to Paths.bak
                print LIST "$p\n";
                foreach my $s (@sfiles) {
                    print LIST "$s\n";
                }
            }
Re: File::Find stat question
by wjw (Priest) on Nov 24, 2014 at 04:34 UTC
    I recently was wading through something like 8k worth of digital photos, looking for duplicates and doing other little chores and found that Path::Iterator::Rule worked pretty nicely. I did not read your code carefully, but it looks like that module might work as an alternative for you. The module does have -X tests and I found it to be pretty simple to configure rules.

    If nothing else, I find that looking at how another module approaches a problem sometimes helps me better understand the module I am working with...

    Just a thought...

    ...the majority is always wrong, and always the last to know about it...

    Insanity: Doing the same thing over and over again and expecting different results...

    A solution is nothing more than a clearly stated problem...otherwise, the problem is not a problem, it is simply an inconvenient fact

Re: File::Find stat question
by igoryonya (Pilgrim) on Nov 24, 2014 at 06:31 UTC
    IMHO in your code:
        #Replace / with \
        (my $fn = $File::Find::name) =~ tr#/#\\#;

        #Replace gpfs_data with nas\rdds
        $fn =~ s/gpfs_data/nas\\rdds/;

        #Push into the @allpathlisting array any paths matching the pattern.
        push @allpathlisting, $fn if /[IPDLMY]\d{8}$/;

        # How big is it?
        my $fsize = (stat($fn))[7];
    You modify the original file name twice, and only then do you stat the modified name.
    So if your original path is /some/path/gpfs_data/some_file.ext,
    you stat \some\path\nas\rdds\some_file.ext.
    That's a completely different file, which probably doesn't exist.
    Shouldn't you stat the file before modifying the path?
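To make the point concrete, here is a small sketch that keeps the real path for stat and builds the rewritten path only for reporting. `display_path` is a hypothetical helper name; the two substitutions are the ones from the original code, and the sample path is the one used above.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the Windows-style path used only for reporting.
sub display_path {
    my ($unix_path) = @_;
    (my $fn = $unix_path) =~ tr#/#\\#;   # replace / with \
    $fn =~ s/gpfs_data/nas\\rdds/;       # replace gpfs_data with nas\rdds
    return $fn;
}

my $real  = '/some/path/gpfs_data/some_file.ext';   # path the filesystem knows
my $shown = display_path($real);                    # path for the listing only

print "stat this:  $real\n";
print "print this: $shown\n";
```

The idea is simply that stat (and any other file test) is given the unmodified name, while the rewritten string only ever goes into the output.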

      Hi There

      I moved the stat up so it runs before I modify the path names, but I'm still getting a 0 for directory size, and I've found by printing $_ that I'm not actually getting the required pattern match /[IPDLMY]\d{8}$/ at times either... I think I'm going to have to resort to not using File::Find in this instance...

          sub all {
              #Unless directory skip.
              return unless -d;

              # Don't recurse past folders matching the pattern.
              $File::Find::prune = 1 if /[IPDLMY]\d{8}$/;

              #How big is it?
              my $fsize = (stat($_))[7];
              print "$fsize\n";
              push @sfiles, $fsize;

              #Replace / with \
              (my $fn = $File::Find::name) =~ tr#/#\\#;

              #Replace gpfs_data with nas\rdds
              $fn =~ s/gpfs_data/nas\\rdds/;

              #Push into the @allpathlisting array any paths matching the pattern.
              push @allpathlisting, $fn if /[IPDLMY]\d{8}$/;

              #print results to the screen, comment this out in production.
              print "$fn\n" if /[IPDLMY]\d{8}$/;
          }
        I'm still getting a 0 for directory size

        Does your script run on a Windows Perl (Strawberry, ActiveState or the like)?

        Windows does not have a stat() system call. Perl emulates it, but not perfectly. I assume - without looking at the perl sources - that the struct stat is simply zeroed out and then filled depending on what the Windows API functions return. For directories, size is simply not touched and stays 0.

        Note that FAT and NTFS are very different from filesystems found on Unix systems, so having a "size" field for a directory may not make any sense at all.

        Also note that the "size" field is not at all related to the size of the files in the directory. On Unix systems, there is -- depending on the filesystem type -- a relation to the highest number of files in the directory over the lifetime of the directory, and maybe the length of the filenames. If you want the sum of the file sizes, calculate it by calling stat for each file in the directory.
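A minimal sketch of that last suggestion, using only core modules. `dir_content_size` is a hypothetical helper name, and the sketch assumes that only regular files directly inside the directory should be counted (no recursion into subdirectories).

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: total size in bytes of the regular files directly
# inside $dir. Subdirectories and other entry types are skipped.
sub dir_content_size {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my $total = 0;
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' or $entry eq '..';
        my $path = "$dir/$entry";
        next unless -f $path;             # regular files only
        $total += (stat($path))[7];       # field 7 is the size in bytes
    }
    closedir $dh;
    return $total;
}

# Usage on a scratch directory with two files and one subdirectory.
my $dir = tempdir(CLEANUP => 1);
for my $spec (['a.dat', 100], ['b.dat', 250]) {
    my ($name, $bytes) = @$spec;
    open my $fh, '>', "$dir/$name" or die "open: $!";
    print {$fh} 'x' x $bytes;
    close $fh or die "close: $!";
}
mkdir "$dir/subdir" or die "mkdir: $!";

my $total = dir_content_size($dir);
print "Total bytes in files: $total\n";
```

The total comes from the files' own stat results, so it does not depend on whatever the platform reports as the size of the directory itself.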

        POSIX leaves the meaning of the st_size field undefined for directories. It is defined for regular files, symlinks, shared memory objects, and typed memory objects, but not for directories. ("For other file types, the use of this field is unspecified.") So, depending on your operating system, stat() may legally return 0 or just garbage for the size of a directory.

        My result for stat on a directory on Win7 NTFS and FAT, Strawberry Perl 5.14.2 is 0, too.

        On Linux 3.10.17, perl 5.18.1, ext3, ext2, procfs, sysfs, tmpfs, the results vary from 0 (procfs, sysfs) to 12288 (heavily used directory on ext3), matching the results from ls. The directory size of 0 on procfs is documented in stat(2), sysfs behaves in the same way.

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)