Use of underscore to reuse file stat data is slower

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
The perldoc says:
If any of the file tests (or either the "stat" or "lstat" operators) are given the special filehandle consisting of a solitary underline, then the stat structure of the previous file test (or stat operator) is used, saving a system call.
So I would expect that the code chunk
Method 1
---------

...
if (-e $path) {
    if (-l $path) {
        print "is a link\n";
    } elsif (-d _) {
        print "is a directory\n";
    } elsif (-f _) {
        print "is a file\n";
    } else {
        print "Error\n";
    }
}

to be faster than
Method 2
---------

if (-e $path) {
    if (-l $path) {
        print "is a link\n";
    } elsif (-d $path) {
        print "is a directory\n";
    } elsif (-f $path) {
        print "is a file\n";
    } else {
        print "Error\n";
    }
}

Since Method 1 makes at least one fewer system call, I expect it to be faster.
But when I benchmarked the two, the results say otherwise:
%./reg.pl
Path is file
Path is file
                  Rate   statreuse nostatreuse
statreuse   31847134/s          --        -39%
nostatreuse 52083333/s         64%          --
%cat reg.pl

#!/opt/perl_5.8.7/bin/perl
use Benchmark qw(:all);
my $path = "test.lnk";

my $coderef = sub {
    if (-e $path) {
        if (-l $path){
            print "path is link \n";
        } elsif (-d _) {
            print "Path is directory \n";
        } elsif (-f _) {
            print "Path is file \n";
        }
    }
};

my $coderef1 = sub {
    if (-e $path) {
        if (-l $path){
            print "path is link \n";
        } elsif (-d $path) {
            print "Path is directory \n";
        } elsif (-f $path) {
            print "Path is file \n";
        }
    }
};

 my $r = cmpthese( 50000000, {
     statreuse => $coderef->(),
     nostatreuse   => $coderef1->()
 });

How can reusing the file stat data be slower?
Am I missing anything here?
thanks,
sateesh


Replies are listed 'Best First'.
Re: Use of underscore to reuse file stat data is slower by Fletch (Bishop) on Jul 10, 2006 at 14:44 UTC
Your benchmark code is wrong. You call each of your coderefs instead of passing them. You meant something like this instead:

cmpthese( 500_000, {
    statreuse   => $coderef,
    nostatreuse => $coderef1,
});

With working benchmark code I see the `_` version running 78% faster than the separate calls.

Update: Meh, typo'd the percentage faster. Actual results running 500,000 tests with `test.lnk` being a plain file:

                Rate nostatreuse   statreuse
nostatreuse  65189/s          --        -44%
statreuse   116279/s         78%          --

Results from stock OS X perl v5.8.6 on a dual 2.7 G5.
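A minimal sketch of the difference between calling a coderef and passing it, which is what the correction above turns on. The sub here is a hypothetical stand-in for the ones in the question, with the print removed. As far as I can tell, the original script handed `cmpthese` each sub's return value (the result of its last `print`), and Benchmark evals a non-reference argument as a string of code, so the 50,000,000 iterations timed a trivial expression rather than the file tests.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the subs in the question (no print).
my $coderef = sub { return -e $0 ? 1 : 0 };

my $called = $coderef->();   # runs the sub once, right now; yields its return value
my $passed = $coderef;       # the sub itself, which Benchmark can call repeatedly

# ref() is an empty string for a plain scalar and 'CODE' for a code reference.
print "called: ref='", ref($called), "'\n";
print "passed: ref='", ref($passed), "'\n";
```

Only the second form gives `cmpthese` something it can run once per iteration.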
Re^2: Use of underscore to reuse file stat data is slower by Anonymous Monk on Jul 11, 2006 at 03:34 UTC
Thanks. It was an error in my benchmark code, and, as others pointed out, the print was unnecessary. The _ version runs faster.
thanks,
sateesh
Re: Use of underscore to reuse file stat data is slower by demerphq (Chancellor) on Jul 10, 2006 at 15:46 UTC
It's very unlikely you want to benchmark code with prints in it. At the very least, executing all those prints 50000000 times is going to take a lot of time streaming the same thing to your terminal over and over and over....
---
$world=~s/war/peace/g
Re: Use of underscore to reuse file stat data is slower by eric256 (Parson) on Jul 10, 2006 at 17:41 UTC
I would recommend not printing. As it is, your code should print 100,000,000 times. That is going to be a heck of a lot of output, and I would guess that the time required to print is going to drown out any difference between the two methods.
___________
Eric Hodges
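One way to take the printing out of the measurement is to have each sub return its classification instead of printing it. The following is only a sketch under assumptions: the sub names, the returned strings, and the use of `$0` (the script's own file) as the test path are illustrative choices, not the original poster's code.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $path = $0;   # assumption: test against this script's own file

# Reuse the cached stat data via `_` after the named tests.
my $statreuse = sub {
    return 'missing'   unless -e $path;
    return 'link'      if -l $path;
    return 'directory' if -d _;
    return 'file'      if -f _;
    return 'other';
};

# Name the path in every test, forcing a fresh stat each time.
my $nostatreuse = sub {
    return 'missing'   unless -e $path;
    return 'link'      if -l $path;
    return 'directory' if -d $path;
    return 'file'      if -f $path;
    return 'other';
};

# A negative count asks Benchmark to run each sub for at least that
# many CPU seconds; the coderefs are passed, not called.
cmpthese( -1, {
    statreuse   => $statreuse,
    nostatreuse => $nostatreuse,
});
```

With no I/O inside the subs, the comparison table reflects the file tests themselves. The actual rates will depend on the machine, filesystem, and perl version.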
Re: Use of underscore to reuse file stat data is slower by swampyankee (Parson) on Jul 10, 2006 at 15:00 UTC
Now, I'm about as far from expert on Perl's guts as possible, but I can see two possibilities:

1. It takes more time for Perl to backtrack for underscore (_) to $path than it takes to make the query (not impossible).
2. The version of Perl you're using doesn't optimize the different -X queries against the same file, and has to repeat the query each time. It takes some time to connect the "_" with the explicit filename.

I suspect the first hypothesis is more likely. After all, getting information about files is an extremely common operation in any O/S, and I would think it would be one which the O/S programmers would try very hard to make very fast.

added in update: Or, of course, I'm completely off the mark and you should listen to Fletch.

emc
e^(π√−1) = −1
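On the "backtrack" hypothesis: per the perldoc passage quoted in the question, `_` does not look the filename up again; it reads the stat structure left by the previous test. A small sketch that makes this visible, assuming a scratch file created with File::Temp (the file and variable names are illustrative only):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a five-byte scratch file.
my ($fh, $file) = tempfile();
print {$fh} "hello";
close $fh or die "close: $!";

my $exists = -e $file;             # one stat call; its result is cached
unlink $file or die "unlink: $!";  # the file is now gone from disk

# `_` answers from the cached stat structure, so the file still "looks" present.
my $cached_is_file = -f _ ? 1 : 0;
my $cached_size    = -s _;

# Naming the path issues a fresh stat, which now fails.
my $fresh_is_file  = -f $file ? 1 : 0;

print "cached: file=$cached_is_file size=$cached_size; fresh: file=$fresh_is_file\n";
```

The cached answers describe the file as it was at the time of the `-e` test, which is also why `_` saves a system call.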