in reply to Can the special underline filehandle (stat cache) be cleared?

You can get the same guarantee by simply turning off the check-nlinks "optimization":

$File::Find::dont_use_nlink = 1;

as I mention from time to time, such as in Re: File::Find in a thread safe fashion (speed).
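A minimal sketch of how the flag is set, and of one way to check whether your installed File::Find really leaves a usable `_` behind for every entry. The throwaway tree, the `$no_such` path and the `@cached` / `@uncached` lists are made up for illustration. The check only tells you that *some* lstat result is cached, not that it belongs to the current entry.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp 'tempdir';

# Turn off the nlink shortcut so File::Find has to lstat each entry.
$File::Find::dont_use_nlink = 1;

# Throwaway tree (illustrative only): one subdirectory, two plain files.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir $top/sub: $!";
for my $file ("$top/a.txt", "$top/sub/b.txt") {
    open my $fh, '>', $file or die "open $file: $!";
    print {$fh} "x";
    close $fh or die "close $file: $!";
}

my $no_such = "$top/no-such-entry";   # never created
my (@cached, @uncached);

find(sub {
    # -e _ answers from the stat cache without touching the disk.
    if (-e _) { push @cached,   $File::Find::name }
    else      { push @uncached, $File::Find::name }

    # A failed lstat invalidates _, so a stale result from this call
    # cannot be mistaken for a fresh one in the next call.
    return -l $no_such;
}, $top);

print "cached:   $_\n" for @cached;
print "uncached: $_\n" for @uncached;
```

Anything that shows up under "uncached" is an entry for which the callback could not rely on `_` and would need its own lstat.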

It is unfortunate that this fact isn't documented. It is also sad that this "fastest for the most common cases" method is not easier to use, and that $dont_use_nlink has nearly been eliminated from the documentation. The maintainers continue their delusion that this latest attempt to make the "optimization" safe has finally succeeded, and they refuse to realize that it actually makes typical uses of File::Find slower (and more complex) than they need to be. It only speeds up the very limited cases where you are selecting files solely based on their names, except when it just doesn't work right.

Update: Doing some quick testing, I find that File::Find may have broken this guarantee. (As I said in the linked node, the code is now too complex for me to easily see that the guarantee still applies like it did ages ago when I first discovered this trick.) File::Find also may no longer give faster results when you use it this way, like it did ages ago. Both results may be due to flawed testing on my part, as it was just a quick hack of a test, but my first suspicion is that File::Find is doing more than it needs to. I'm not going to spend more time trying to figure out what's really going on with this module that I never use. Indeed, just rolling my own replacement for File::Find cuts my run-time in half.
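For anyone wondering what "rolling my own" can look like, here is a rough sketch. It is not the replacement mentioned above (that code isn't shown in this thread), just one plausible shape for it. The name `walk_tree` and its callback signature are invented for the example. It does exactly one lstat per entry and hands the result to the callback, so the callback never depends on `_`.

```perl
use strict;
use warnings;
use Fcntl ':mode';            # S_ISDIR, S_ISREG
use File::Temp 'tempdir';

# Hypothetical helper, not part of File::Find: visit every entry under
# $top, calling $visit->($path, \@lstat) once per entry.
sub walk_tree {
    my ($top, $visit) = @_;
    my @pending = ($top);
    while (defined(my $path = pop @pending)) {
        my @st = lstat $path or next;        # entry vanished: skip it
        $visit->($path, \@st);
        next unless S_ISDIR($st[2]);         # symlinks are not followed
        opendir(my $dh, $path) or next;      # unreadable directory: skip
        push @pending,
            map  { "$path/$_" }
            grep { $_ ne '.' && $_ ne '..' } readdir $dh;
        closedir $dh;
    }
    return;
}

# Usage: build a tiny throwaway tree and total the size of its plain files.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/sub" or die "mkdir $top/sub: $!";
for my $file ("$top/a.txt", "$top/sub/b.txt") {
    open my $fh, '>', $file or die "open $file: $!";
    print {$fh} "12345";
    close $fh or die "close $file: $!";
}

my ($files, $bytes) = (0, 0);
walk_tree($top, sub {
    my ($path, $st) = @_;
    return unless S_ISREG($st->[2]);         # mode is field 2
    $files++;
    $bytes += $st->[7];                      # size is field 7
});
print "$files files, $bytes bytes\n";
```

Passing the lstat list explicitly costs a little copying, but it sidesteps the whole question of whether `_` is still valid by the time the callback runs.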

My quick hack of a test also validates Tanktalus' point, though testing eons ago did show a speed-up for me. Tanktalus++

- tye        

Re^2: Can the special underline filehandle (stat cache) be cleared? (nlinks)
by ammon (Sexton) on Oct 05, 2006 at 01:18 UTC
    Ah, thanks for the suggestion about $File::Find::dont_use_nlink. I'll take a look at it. Unfortunately, I've spent enough time in File::Find that it doesn't look as complex as it used to (I've submitted patches for a couple bugs, as a result). :-}

    Update: yup... it's no longer a guarantee that a stat was done on the file.

    However, the more I look at File::Find, the more I want to follow your lead, and roll my own replacement for File::Find -- it's not exactly a stellar example of maintainable code.

    Cheers,

      So, I'm curious. When you have $File::Find::dont_use_nlink set, in what case does File::Find not lstat a file?

      - tye        

        At minimum, it doesn't provide a valid lstat on directories immediately below the search root:

        # file `perl -MFile::Find -e'$File::Find::dont_use_nlink = 1; find(sub { lstat _ or print "$File::Find::name\n"; -l "" }, "/tmp")'`
        /tmp/opt: directory
        /tmp/hsperfdata_ammon: directory
        /tmp/.ICE-unix: sticky directory
        /tmp/xmmskde: directory
        /tmp/gpg-SRkPcN: directory
        /tmp/kde-root: directory
        /tmp/0165158219: directory
        /tmp/gconfd-root: directory
        /tmp/scrollkeeper-root: directory
        /tmp/hsperfdata_root: directory
        /tmp/gpg-ZV9kSX: directory
        /tmp/kde-steve: directory
        /tmp/ksocket-root: directory
        /tmp/gpg-h5tTLO: directory
        /tmp/1051437914: directory
        /tmp/gpg-I5lCsj: directory
        /tmp/.X11-unix: sticky directory
        /tmp/orbit-root: directory

        (Of course, if your system equates lstat "" with lstat ".", you'll need to -l a non-existing file to clear _.)
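The clearing trick can be seen in isolation with a small sketch like this one. The temporary directory and the `$gone` path are made up for the example. Using a path that is known not to exist avoids the `lstat ""` portability question.

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

my $dir  = tempdir(CLEANUP => 1);
my $gone = "$dir/no-such-entry";       # never created

lstat $dir or die "lstat $dir: $!";    # fills the _ cache
my $before = -d _ ? 1 : 0;             # answered from the cache

my $is_link = -l $gone;                # lstat fails, so _ holds no valid result
my $after   = -e _ ? 1 : 0;            # the cache is now "empty"

print "before=$before after=$after\n";
```

On the systems I would expect this to be run on, it prints `before=1 after=0`: once the lstat behind `-l` has failed, any test against `_` fails too until the next successful stat or lstat.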

        In the above example, there are sub-trees below some of those directories listed, and they do have a valid stat cache. I don't know, offhand, if there are any links in /tmp at the moment, so I can't say how symlinks are handled.

        I should note that this test is using File::Find version 1.07, which is a bit behind what's available on CPAN, but I don't recall the newer version behaving any differently when I tested it.

        At any rate, the custom module I wrote ended up being 50% faster than File::Find, when tested on the 30K file tree mentioned in the original post, and the same speed as File::Find on very small trees. Best of all, though? It does exactly what I need. :)