The documentation for File::Find states that an lstat() is guaranteed to have been called before wanted() when the 'follow' or 'follow_fast' options are in use. As a result, you can use the underline stat cache (_) in the wanted() function.

When not using the 'follow' or 'follow_fast' options, however, there is no such guarantee: lstat() may or may not have been called, so you can't rely on the underline stat cache containing the stat info of the current file. You have to do the lstat() yourself.

In a typical file tree that I'm trawling, I'm finding that File::Find has not done a stat() call on about 80% of the files. The other 20% of files have had a stat() call done prior to the wanted() function call. On one set of test data, that 20% represents 6800+ files. Our filesystem gets too much wear and tear already, so I'd like to avoid duplicating those stat calls when File::Find has already done them.

Edit: How can I know whether File::Find has called lstat() or not?

My first thought is, at the end of my wanted() function, to clear the _ cache. That doesn't seem to be possible, aside from doing something like -l q() (it appears that the lstat() system call simply returns ENOENT when passed an empty string, without going to the filesystem). Is there a better way of invalidating the contents of _? The perlfunc documentation mentions how it gets set, but I haven't yet found anything which indicates that cached data can ever be reset, other than by another call to a stat or filetest.
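A minimal sketch of the reset idea above. It relies on behaviour observed in this post rather than anything documented: a failed lstat() leaves _ in a state where lstat _ returns an empty list. The temporary file and the variable names are only there for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file so that there is something real to lstat.
my ($fh, $file) = tempfile(UNLINK => 1);

# A successful lstat fills the _ cache; lstat _ then reads from that cache.
my @fresh  = lstat $file;
my @cached = lstat _;

# A deliberately failing lstat on the empty string replaces the cached state.
my @failed = do { no warnings; lstat '' };

# With the cache in a failed state, lstat _ returns an empty list.
# It also warns about an unopened filehandle, hence the no warnings.
my @after = do { no warnings; lstat _ };

printf "cached inode matches fresh: %s; fields after reset: %d\n",
    ($cached[1] == $fresh[1] ? 'yes' : 'no'), scalar @after;
```

Whether the failing call really avoids touching the filesystem is the assumption stated above; the sketch only shows the effect on _.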

My second question is a direct consequence of clearing _: is it possible to detect that _ is, or is not, a valid stat cache? My current method is:

my ($dev, $inode) = do { no warnings; lstat _ };
($dev, $inode) = lstat $_ unless $dev;

Without the no warnings, I get the warning "lstat() on unopened filehandle _". Unfortunately, I can't find any documentation that indicates any way to test whether _ is valid. I thought I might be able to check the defined-ness of *::_{FILEHANDLE}, but I was wrong (it generates a deprecation warning and always returns an IO::Handle object). I also tried the simpler

my ($dev, $inode) = _ ? lstat _ : lstat $_;
but that generates an error saying Bareword "_" not allowed while "strict subs" in use.
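For illustration, here is a sketch of how the "use _ if it looks valid, otherwise lstat $_, then reset" pattern could sit inside wanted(). The temporary tree, the %count hash and the eval guard are assumptions added for the example. The eval is there because lstat _ dies if the previous call was a stat rather than an lstat. A cache hit here only means the last lstat succeeded; the sketch does not prove that the cached data belongs to $_, which depends on File::Find internals.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throwaway tree: three files and one subdirectory.
my $root = tempdir(CLEANUP => 1);
for my $name (qw(a b c)) {
    open my $fh, '>', "$root/$name" or die "open $root/$name: $!";
    close $fh;
}
mkdir "$root/sub" or die "mkdir $root/sub: $!";

my %count = (cached => 0, fresh => 0);

sub wanted {
    # Try the cache first; an empty list means _ holds no successful lstat.
    my @st = eval { no warnings; lstat _ };
    if (@st) {
        $count{cached}++;
    }
    else {
        @st = lstat $_;
        $count{fresh}++;
    }

    # ... work with @st here ...

    # Reset the cache so the next call cannot mistake this entry's data
    # for its own (the failing lstat on '' is the trick from the post).
    my @discard = do { no warnings; lstat '' };
    return;
}

find(\&wanted, $root);
printf "from cache: %d, fresh lstat: %d\n", @count{qw(cached fresh)};
```

The two counters give a rough picture of how often File::Find had already done the lstat() on a given tree.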

Any suggestions are welcome.

Cheers,


In reply to Can the special underline filehandle (stat cache) be cleared? by ammon
