PerlMonks  

Re: smart glob of dated subfolders

by Corion (Patriarch)
on Feb 22, 2023 at 20:18 UTC


in reply to smart glob of dated subfolders

Both glob and readdir will end up calling the same underlying C function, so the only way to actually avoid reading a large directory is to introduce a processed/ folder into which you move all folders that have already been processed.
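A minimal sketch of the processed/ idea, under stated assumptions: the helper name process_new_folders and the callback argument are hypothetical, and processed/ is assumed to live on the same filesystem as the folders being moved, so that a plain rename works on directories. The directory still gets read on every run, but it only ever contains the folders that have not been handled yet.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: run $handler on every subfolder of $root (except
# processed/ itself), then move that subfolder into $root/processed.
# Returns the names of the folders handled in this run.
sub process_new_folders {
    my ( $root, $handler ) = @_;

    my $done = File::Spec->catdir( $root, 'processed' );
    unless ( -d $done ) {
        mkdir $done or die "Cannot create $done: $!";
    }

    opendir my $dh, $root or die "Cannot open $root: $!";
    my @pending = sort grep {
               $_ ne '.'
            && $_ ne '..'
            && $_ ne 'processed'
            && -d File::Spec->catdir( $root, $_ )
    } readdir $dh;
    closedir $dh;

    for my $name (@pending) {
        my $src = File::Spec->catdir( $root, $name );
        $handler->($src);

        # Assumption: same filesystem, so rename can move a directory.
        rename( $src, File::Spec->catdir( $done, $name ) )
            or die "Cannot move $src: $!";
    }
    return @pending;
}
```

It could be called as process_new_folders( $root, sub { my ($path) = @_; ... } ); a second call right afterwards finds nothing left to do.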

Replies are listed 'Best First'.
Re^2: smart glob of dated subfolders
by Anonymous Monk on Feb 25, 2023 at 11:35 UTC
    Both glob and readdir will end up calling the same underlying C function
    use strict;
    use warnings;
    use Cwd;
    use Benchmark;

    my $dir = 'c:/windows';
    my ( @a1, @a2 );

    timethese 1, {
        glob => sub {
            my $cwd = getcwd;
            chdir $dir;
            @a1 = glob '*/*';
            chdir $cwd;
        },
        read => sub {
            my $cwd = getcwd;
            chdir $dir;
            opendir my $h, '.' or die;
            my @a = grep { $_ ne '.' and $_ ne '..' and -d $_ } readdir $h;
            for my $d ( @a ) {
                opendir my $hh, $d or next;
                push @a2, map "$d/$_", grep { $_ ne '.' and $_ ne '..' } readdir $hh;
            }
            chdir $cwd;
        }
    };

    use Test::More;
    is $#a1, $#a2, 'array lengths are equal' or do {
        use Test::Differences;
        eq_or_diff [ sort @a1 ], [ sort @a2 ], 'look deeper', { context => 0 };
    };
    done_testing;

    I don't care much about the 7 (out of ~27e3) entries missing in one case (something to do with a leading dot in the name), but I wonder whether an orders-of-magnitude speed difference is what the OP is observing for his large tree. My Perl is the latest Strawberry, on fast NVMe storage.

    Benchmark: timing 1 iterations of glob, read...
          glob:  4 wallclock secs ( 0.30 usr +  3.34 sys =  3.64 CPU) @  0.27/s (n=1)
                (warning: too few iterations for a reliable count)
          read:  0 wallclock secs ( 0.03 usr +  0.06 sys =  0.09 CPU) @ 10.53/s (n=1)
                (warning: too few iterations for a reliable count)
    not ok 1 - array lengths are equal
    #   Failed test 'array lengths are equal'
    #   at glob.pl line 39.
    #          got: '27633'
    #     expected: '27640'
    not ok 2 - look deeper
    #   Failed test 'look deeper'
    #   at glob.pl line 41.
    # +----+-----+----+-------------------------------------------+
    # | Elt|Got  | Elt|Expected                                   |
    # +----+-----+----+-------------------------------------------+
    # |    |     * 656| 'INF/.NET CLR Data',                      *
    # |    |     * 657| 'INF/.NET CLR Networking',                *
    # |    |     * 658| 'INF/.NET CLR Networking 4.0.0.0',        *
    # |    |     * 659| 'INF/.NET Data Provider for Oracle',      *
    # |    |     * 660| 'INF/.NET Data Provider for SqlServer',   *
    # |    |     * 661| 'INF/.NET Memory Cache 4.0',              *
    # |    |     * 662| 'INF/.NETFramework',                      *
    # +----+-----+----+-------------------------------------------+
    1..2
    # Looks like you failed 2 tests of 2.
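All seven entries missing from the glob result have names starting with a dot, which fits glob's usual rule that a `*` wildcard does not match a leading dot. A small sketch that reproduces the difference in a throwaway directory follows; the helper name list_both_ways and the two folder names are made up for illustration, and it works via chdir like the benchmark above.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical helper: list the entries of $dir once with glob '*' and once
# with readdir (minus . and ..). Returns two array refs, both sorted.
sub list_both_ways {
    my ($dir) = @_;
    my $cwd = getcwd;
    chdir $dir or die "Cannot chdir to $dir: $!";

    my @globbed = glob '*';
    @globbed = sort @globbed;

    opendir my $dh, '.' or die "Cannot open $dir: $!";
    my @read = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    @read = sort @read;

    chdir $cwd or die "Cannot chdir back to $cwd: $!";
    return ( \@globbed, \@read );
}

# Throwaway directory with one dot-prefixed and one ordinary subfolder.
my $tmp = tempdir( CLEANUP => 1 );
for my $name ( '.NETFramework', 'System32' ) {
    mkdir "$tmp/$name" or die "Cannot create $name: $!";
}

my ( $globbed, $read ) = list_both_ways($tmp);
print "glob:    @$globbed\n";    # glob:    System32
print "readdir: @$read\n";       # readdir: .NETFramework System32
```

So the two listings are only comparable if dot-prefixed names are either filtered out of the readdir result or matched explicitly in the glob pattern.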
