My first mistake was not fully reading the "many_stats" routine; it is using the List::Util::reduce routine, not the sort routine as I'd assumed/glossed over. So, I added a test that did use the worst non-contrived combination of stat and sort, and THAT gave me the results I expected.
                 Rate most_stats graff_hash zaxo_first more_stats sk_maxfile the_ls_lrt
    most_stats 4.64/s         --       -82%       -84%       -86%       -89%       -97%
    graff_hash 25.6/s       452%         --       -14%       -22%       -41%       -82%
    zaxo_first 29.8/s       543%        16%         --        -9%       -31%       -79%
    more_stats 32.9/s       608%        28%        10%         --       -24%       -76%
    sk_maxfile 43.3/s       832%        69%        45%        32%         --       -69%
    the_ls_lrt  140/s      2909%       445%       368%       325%       223%         --
Doing two stats for every compare in the sort routine is REALLY bad, and the next worst option is graff's caching of the file dates in a hash, then sorting. Then we have the two attempts using List::Util::reduce; I don't quite understand how zaxo's caching of dates can be worse than stat'ing during each compare. My only guess is that setting up the two-dimensional array, plus all the dereferencing, creates enough of a penalty to outweigh the stat calls.
Then we see that sk's function to pull only the newest file out as we're going through the array is a bit better than the reduce options. And finally letting the system's 'ls' routine do most of the work for us is far and away the best option (ignoring portability issues).
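To put some numbers behind the guess about zaxo's caching, here is a rough sketch that counts how often each strategy asks for a file's mtime. It does not touch the disk: `counted_mtime` and `%fake_mtime` are hypothetical stand-ins for `(stat $file)[9]`, invented here so the counts are repeatable. It only counts lookups; it says nothing about how long a real stat takes.

```perl
use strict;
use warnings;
use List::Util 'reduce';

# Hypothetical stand-in for (stat $file)[9]: a fake mtime table plus a call counter.
my %fake_mtime = map { ( "file$_" => $_ * 10 ) } 1 .. 100;
my $calls = 0;
sub counted_mtime { $calls++; return $fake_mtime{ $_[0] } }

my @files = sort keys %fake_mtime;

# Lookup inside every comparison of a full sort (the "most_stats" shape).
$calls = 0;
my @sorted = sort { counted_mtime($b) <=> counted_mtime($a) } @files;
my $sort_calls = $calls;

# Lookup inside every step of reduce (the "more_stats" shape): 2 * (N - 1) calls.
$calls = 0;
my $newest_uncached = reduce { counted_mtime($a) < counted_mtime($b) ? $b : $a } @files;
my $reduce_calls = $calls;

# One lookup per file, cached in [name, mtime] pairs (the "zaxo_first" shape): N calls.
$calls = 0;
my $newest_cached = ( reduce { $a->[1] < $b->[1] ? $b : $a }
                      map    { [ $_, counted_mtime($_) ] } @files )->[0];
my $cached_calls = $calls;

printf "sort: %d lookups, uncached reduce: %d, cached reduce: %d\n",
    $sort_calls, $reduce_calls, $cached_calls;
```

With reduce, the uncached version makes at most about twice as many lookups as the cached one, whereas a sort makes many more. If the operating system serves repeated stats from its cache (an assumption, not something measured here), that factor of two could plausibly cost less than building one anonymous array per file.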
Code follows:
    use strict;
    use warnings;
    use List::Util 'reduce';
    use Benchmark qw(cmpthese);

    our $path = shift || '/usr/bin/*';

    sub badsort {
        my @list = sort { (stat $b)[9] <=> (stat $a)[9] } glob "$path";
        my $newest = $list[0];
    }

    sub goodsort {
        my %file_date;
        for ( glob "$path" ) {
            $file_date{$_} = (stat)[9];
        }
        my $newest = ( sort { $file_date{$b} <=> $file_date{$a} } keys %file_date )[0];
    }

    sub badreduce {
        my $newest = reduce { (stat $a)[9] < (stat $b)[9] ? $b : $a } glob "$path";
    }

    sub goodreduce {
        my $newest = ( reduce { $a->[1] < $b->[1] ? $b : $a }
                       map { [ $_, (stat)[9] ] } glob "$path" )->[0];
    }

    sub d {
        my %file_date;
        my $max = -99999999; # set it to first file's mtime would be better
                             # but just for demonstration here
        my $mtime;
        my $file;
        for ( glob "$path" ) {
            $mtime = (stat)[9];
            if ($max <= $mtime) {
                $file = $_;
                $max  = $mtime;
            }
        }
        my $newest = $file;
    }

    sub e {
        my @array = `ls -lrt $path`;
        my $newest = $array[-1];
    }

    cmpthese(250, {
        zaxo_first => \&goodreduce,
        graff_hash => \&goodsort,
        more_stats => \&badreduce,
        most_stats => \&badsort,
        sk_maxfile => \&d,
        the_ls_lrt => \&e,
    });
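The comment in sub d says that seeding $max from the first file's mtime would be better than a magic number. One way that might look is sketched below. The name `newest_file` is made up for this example and was not part of the benchmark. The sketch also skips anything that cannot be stat'ed, which the benchmarked version does not do.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the most recently modified path from the list,
# or undef if the list is empty or nothing in it can be stat'ed.
sub newest_file {
    my @files = @_;
    my ( $newest, $max );
    for my $file (@files) {
        my $mtime = ( stat $file )[9];
        next unless defined $mtime;    # stat failed (e.g. dangling symlink)
        # The first usable file seeds $max; later ties go to the later file,
        # matching the <= test in sub d.
        if ( !defined $max || $mtime >= $max ) {
            ( $newest, $max ) = ( $file, $mtime );
        }
    }
    return $newest;
}

my $path   = shift || '/usr/bin/*';
my $newest = newest_file( glob $path );
print defined $newest ? "$newest\n" : "no files matched\n";
```

Taking the file list as arguments rather than reading a global $path keeps the glob out of the helper, so it should work the same on a list from readdir or File::Find.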