hitesh_ofdoon has asked for the wisdom of the Perl Monks concerning the following question:

Hi.

I want to pipe the output of the UNIX find command to perl to get the atime of each file, and I am not sure why it is not working. The following correctly gives the atime:

/sasdata=>perl -e '$l=$ARGV[0]; @d=localtime((stat($l))[8]);printf "%4d%02d%02d %s\n",$d[5]+1900,$d[4]+1,$d[3],$l' mhugo01
20090824 mhugo01

But the following does not:

/sasdata=>find /sasdata/it/development/sasmonitoring/code -type f | perl -e 'while (<>) {$l=$_;@d=localtime((stat($l))[8]);printf "%4d%02d%02d %s",$d[5]+1900,$d[4]+1,$d[3],$l}'
19691231 /sasdata/it/development/sasmonitoring/code/growthmonitor.sas
19691231 /sasdata/it/development/sasmonitoring/code/growthmonitor.sh

Both do a stat on the variable $l, and $l has the correct value in both.

Re: stat on find output
by kennethk (Abbot) on Oct 19, 2009 at 21:09 UTC
    Your problem is with newlines. Specifically, each line of input from find ends with a newline, but of course there is no file "growthmonitor.sas\n" on your system. The stat therefore fails and returns an empty list, (stat($l))[8] is undef, localtime treats that as 0, and @d holds the start of the epoch. You can fix your code by chomping your input before feeding it to stat; of course, you then also need to add a newline to your output. The following will do what you expect:

    find /sasdata/it/development/sasmonitoring/code -type f |perl -e 'while (<>) {chomp;$l=$_;@d=localtime((stat($l))[8]);printf "%4d%02d%02d %s\n",$d[5]+1900,$d[4]+1,$d[3],$l}'

      Yes, it was chomp. It's working now. Thank you guys for responding.
Re: stat on find output
by jakobi (Pilgrim) on Oct 19, 2009 at 21:49 UTC
    If you chomped the line ends as kennethk suggested but things still look funny, here's another possible cause, typically encountered with Unix servers in a larger compute center:

    Do check the mount output and every mtab / fstab you can lay your hands on, both locally and on remote NFS servers: grep for a mount option like noatime or similar (note that Linux offers intermediate options, such as relatime, between accurate atime and noatime).

    Note that in such setups, there's a high likelihood of e.g. /sasdata and /sasdata/it/development/ actually being different (NFS?) filesystems.
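A rough sketch of that check follows. The helper name `atime_mounts` and the sample mount line are hypothetical, and the option names are the Linux ones; other systems (Solaris included) may spell their options differently, so treat the pattern as an assumption to adapt.

```shell
# Filter: keep only lines that mention an atime-related mount option.
atime_mounts() { grep -Ew 'noatime|nodiratime|relatime|strictatime'; }

# Hypothetical sample line, only to show what a match looks like.
sample='nfs1:/export/sas on /sasdata type nfs (rw,noatime)'
hit=$(printf '%s\n' "$sample" | atime_mounts)
echo "$hit"

# On a real host, feed it the live mount table instead.
mount | atime_mounts || true

# To see which filesystem each path really lives on (paths from the thread):
# df -P /sasdata /sasdata/it/development/
```

If a path turns out to sit on a filesystem mounted with noatime, the atime values reported by stat will not reflect reads.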

    "Millions of files and 5 TB" (see the comment below) sounds like a slow and well-known scenario.

    Consider using _ as the "filename": it lets you reuse the result of your last stat without going all the way to the file cache or block buffer again. Depending on how quickly the files change, it might be worthwhile to redirect the find output to a file. Better yet, fold the find into Perl proper, possibly testing File::Find against readdir. Assuming seek times are dominant and the system is under load, you might nearly halve your number of seeks.

      Yes, it is a huge Solaris server and I am trying to get the last read time on millions of files on 5 TB of storage.

        Maybe a more readable / portable / simple approach would be to be more perlish and use File::Util and its last_access, last_modified, etc. functions? Maybe this doesn't apply to your server situation?

        Just a something something...
Re: stat on find output
by zwon (Abbot) on Oct 19, 2009 at 21:02 UTC

    What OS is this? What file system is mounted on /sasdata? Does it support atime?

    Update: kennethk is probably right and the problem is in the line endings. Still, I don't understand why you're getting 31-12-1969; for me it gives 01-01-1970:

    $ perl -e'printf "%02d-%02d-%02d\n", (localtime(undef))[5,4,3]'
    70-00-01
      Time zone - since the epoch starts at midnight on Jan 1, 1970 GMT, people west of the Greenwich Meridian (Prime Meridian) will see that as occurring in the previous year.