Re: More Fileglob in scalar context question

Another option would be to skip the glob operator altogether, and use (open|read|close)dir instead.

This avoids the odd glob behavior, and also gets around some other glob() limitations you might hit later (like "too many args").
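
For anyone following along, the "odd glob behavior" is presumably the scalar-context iterator: each glob call site keeps its own list and only looks at the pattern again once that list is exhausted. Here's a minimal sketch of the effect, assuming a throwaway fixture of two directories (0 and 1) with two matching files each; the file and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Throwaway fixture (hypothetical names): dirs 0 and 1, two files each.
my $start = getcwd();
my $root  = tempdir(CLEANUP => 1);
chdir $root or die "Can't chdir to $root: $!";
for my $dir (0, 1) {
    mkdir $dir or die "Can't mkdir $dir: $!";
    for my $name ('a_test.txt', 'b_test.txt') {
        open(my $fh, '>', "$dir/$name") or die "Can't create $dir/$name: $!";
        close($fh);
    }
}

my (@scalar_hits, @list_hits);
for my $dir (0, 1) {
    # Scalar context: this call site keeps its own iterator and only
    # re-reads the pattern once the previous list is exhausted.
    my $next = glob("$dir/*test.txt");
    push @scalar_hits, $next;

    # List context: the pattern is expanded afresh on every call.
    my ($first) = glob("$dir/*test.txt");
    push @list_hits, $first;
}

print "scalar: @scalar_hits\n";   # 0/a_test.txt 0/b_test.txt
print "list:   @list_hits\n";     # 0/a_test.txt 1/a_test.txt

chdir $start or die "Can't chdir back to $start: $!";
```

On the second pass the scalar-context call is still working through directory 0, which is why it never reports anything from directory 1.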

Here's my shot at converting your short example:

for (my $i=0; $i < 5; $i++){
    opendir(DIR,$i) or die("Can't open directory $i: $!");
    foreach my $file (readdir(DIR)) {
        # if the filename ends in "test.txt"...
        # adjust the regex as needed
        if ($file =~ /test\.txt$/) {
            print "$i, -${file}-\n";
        }
    }
    closedir(DIR);
}

Update: Per scain's comment, it looks like perl's glob was updated in 5.6.0 to use an internal routine on most implementations. OTOH, it looks like p5p is saying they will tie it to File::Glob, which pulls in Exporter, etc., probably slowing it down.

As for which one is faster: using the short example, readdir() comes out clearly ahead on perl 5.6.0 (about 4 times faster by wallclock, about 1.7 times by CPU time) and hundreds of times faster by wallclock on perl 5.005_03 (where glob() still calls out to csh). Here's what I tried, after making a few test directories and files:

use Benchmark;
timethese(10000, {
    'readdir' => sub {
        my @blah;
        for (my $i=0; $i < 5; $i++){
            opendir(DIR,$i) or die("Can't open directory $i: $!");
            foreach my $file (readdir(DIR)) {
                # escape the dot so it only matches a literal "."
                if ($file =~ /test\.txt$/) {
                    push(@blah,$file);
                }
            }
            closedir(DIR);
        }
    },
    'glob' => sub { 
         my @blah;
         for (my $i=0; $i < 5; $i++){
             for (<$i/*test.txt>) {
                 push(@blah,$_);
             }
         }
 
    }
}
);
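
The benchmark expects directories named 0 through 4 in the current directory. I didn't post what I used to create them, so here's a rough sketch of one way to do it. make_fixture() and touch() are hypothetical helpers, and the file names and counts are assumptions, not what produced the numbers in this post.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# make_fixture() and touch() are hypothetical helpers, not part of the
# original benchmark. Creates dirs 0..4 under $base, each holding
# $per_dir files that match *test.txt plus one file that does not.
sub make_fixture {
    my ($base, $per_dir) = @_;
    for my $dir (0 .. 4) {
        my $path = "$base/$dir";
        unless (-d $path) {
            mkdir $path or die "Can't mkdir $path: $!";
        }
        for my $n (1 .. $per_dir) {
            touch("$path/${n}test.txt");
        }
        touch("$path/other.dat");
    }
    return $base;
}

# Create an empty file (or truncate an existing one).
sub touch {
    my ($file) = @_;
    open(my $fh, '>', $file) or die "Can't create $file: $!";
    close($fh);
}

# Use the directory given on the command line, else a fresh temp dir
# (left in place so the benchmark can be run from inside it).
my $base = make_fixture(@ARGV ? $ARGV[0] : tempdir(), 3);
print "Fixture created in $base; cd there before running the benchmark\n";
```

More files per directory will shift the absolute numbers, so treat any timings from a fixture like this as relative only.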

Results on my perl 5.6.x box:

Benchmark: timing 10000 iterations of glob, readdir...
      glob: 12 wallclock secs ( 3.24 usr +  2.22 sys =  5.46 CPU) @ 1831.50/s (n=10000)
   readdir:  3 wallclock secs ( 1.54 usr +  1.69 sys =  3.23 CPU) @ 3095.98/s (n=10000)

On a 5.005_03 box (1000 iterations, because glob is so slow there):

Benchmark: timing 1000 iterations of glob, readdir...
      glob: 216 wallclock secs ( 1.83 usr  9.21 sys + 63.22 cusr 107.38 csys =  0.00 CPU)
   readdir:  1 wallclock secs ( 0.43 usr +  0.37 sys =  0.80 CPU)

Re: Re: More Fileglob in scalar context question by scain (Curate) on Jul 02, 2001 at 21:05 UTC
kschwab,

I was under the impression that glob worked in this way now anyway, as opposed to the old way using the shell. If it does work the way I think, the "too many args" thing should be a thing of the past, no? And if that is true, it seems like it might be faster to get all of the files with one glob, and then loop through them.

Thanks,
Scott
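
A rough sketch of the "one glob, then loop" idea, assuming the directories are literally named 0 through 4 so a character class can cover them. The fixture and variable names are made up for illustration, and whether this actually beats readdir() is not something this sketch measures.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Throwaway fixture so the sketch runs anywhere: dirs 0..4, each with
# two matching files and one non-matching file (hypothetical names).
my $start = getcwd();
my $root  = tempdir(CLEANUP => 1);
chdir $root or die "Can't chdir to $root: $!";
for my $dir (0 .. 4) {
    mkdir $dir or die "Can't mkdir $dir: $!";
    for my $name ('a_test.txt', 'b_test.txt', 'notes.log') {
        open(my $fh, '>', "$dir/$name") or die "Can't create $dir/$name: $!";
        close($fh);
    }
}

# One glob in list context covers all five directories, then one loop
# splits each result back into directory and file name.
my @found;
for my $path (glob('[0-4]/*test.txt')) {
    my ($dir, $file) = split m{/}, $path, 2;
    push @found, [$dir, $file];
    print "$dir, -${file}-\n";
}

chdir $start or die "Can't chdir back to $start: $!";
```

Because the glob is called once in list context, the scalar-context iterator never comes into play here.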