Re: Sort directories by date
by Corion (Patriarch) on Sep 28, 2015 at 17:49 UTC
Re: Sort directories by date
by graff (Chancellor) on Sep 29, 2015 at 03:53 UTC
Believe it or not, your question may actually be ambiguous. The "latest" directory could either be the one whose name represents the most recent month-day-year date, OR the one whose modification date is the most recent (i.e. the one that has most recently had a file added, deleted, renamed, etc.). The other replies above all take it for granted that the first of these two was your intended meaning.
But maybe those two possible interpretations happen to yield the same relative ordering of directories - that is, it could be that no changes ever occur in the contents of "09302015" once "10012015" is created (and likewise for the latter, once "10022015" is created). If this is true, you can sort the directories based on their modification dates:
my $dir = "/my_dir";
opendir DIR, $dir or die "$dir: $!\n";
my %subdir;
while ( my $month_dir = readdir( DIR ) ) {
next unless ( $month_dir =~ /^\d{8}$/ );
$subdir{$month_dir} = -M "$dir/$month_dir"; # -M needs the full path, not just the name
}
my $latest_dir = ( sort {$subdir{$a}<=>$subdir{$b}} keys %subdir )[0];
# ...
Of course, if you really, truly meant for the actual directory names to be the basis for sorting, then a transform in the style of the Schwartzian Transform (rearrange each name to YYYYMMDD, sort, then rearrange back) is probably the most economical. Instead of using the hash and while loop shown above, just do this:
my $latest_dir =
( map{s/(....)(....)/$2$1/; $_} sort map{s/(....)(....)/$2$1/; $_}
grep /^\d{8}$/, readdir DIR )[-1];
Re: Sort directories by date
by poj (Abbot) on Sep 28, 2015 at 18:52 UTC
#!perl
use strict;
sub direc {
my $dir = "/my_dir";
opendir DIR, $dir or die "$dir: $!\n";
# Read in all directories in /my_dir first
my @month_dir = grep /^\d{8}$/, readdir(DIR);
my %ymd = map {/(\d{4})(\d{4})/;$2.$1,$_} @month_dir ;
my $latest_ymd = (sort keys %ymd)[-1];
my $latest_dir = $ymd{$latest_ymd};
closedir(DIR);
# Now read in .txt files in latest directory found
opendir LATESTDIR, "$dir/$latest_dir" or die "$dir/$latest_dir: $!\n";
my @files = grep /\.txt$/, readdir(LATESTDIR);
closedir(LATESTDIR);
return \@files;
}
poj
Re: Sort directories by date
by RichardK (Parson) on Sep 28, 2015 at 18:35 UTC
It would have been much easier if you'd named the directories 'YYYYMMDD'; then they would have sorted naturally with little effort.
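To illustrate the naming point, here is a minimal sketch. The directory names are hypothetical (the thread's sample dates rearranged to YYYYMMDD); with the year first, a plain string sort is already chronological.

```perl
use strict;
use warnings;

# Hypothetical directory names in YYYYMMDD form.
my @dirs = qw(20141211 20150105 20150220 20110101 20150909);

# The default sort compares strings; fixed-width YYYYMMDD strings
# therefore come out in chronological order.
my @sorted = sort @dirs;
my $latest = $sorted[-1];

print "$latest\n";
```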
BTW, opendir/readdir are quite low-level and a pain to use unless you really have to. You could try File::Find::Rule instead:
use v5.20;
use warnings;
use File::Find::Rule;
my @dirs = File::Find::Rule->directory()->maxdepth(1)->in('.');
say $_ for @dirs;
Re: Sort directories by date
by Laurent_R (Canon) on Sep 29, 2015 at 10:30 UTC
Although it may not matter that much (depending on how many subdirectories you have in your root directory), sorting the whole list just to get the most recent item is somewhat inefficient, even with a fast sorting algorithm and using Schwartzian transform or Guttman-Rosler transform, because it requires the computer to do much more work than what is actually needed.
I usually would not care that much about that with a short list of subdirectories, but it sometimes matters, and there are more efficient algorithms for picking the latest (or largest, or smallest, whatever) element in a list.
For example, something like this at the command line (quick test):
$ perl -e '
> my @list = qw/12112014
> 01052015
> 02202015
> 03102015
> 01012011
> 10102014
> 04092015
> 09092015
> 09092013/;
> chomp @list;
> my $max_y = "0000";
> my $max_d = "0000";
> for my $dir (@list) {
> my ($d, $y) = $dir =~ /(\d{4})(\d{4})/;
> if ($y > $max_y) {
> $max_y = $y;
> $max_d = $d; # new latest year: restart the month-day maximum
> } elsif ($y == $max_y) {
> $max_d = $d if $d > $max_d;
> }
> }
> print "$max_d$max_y\n";
> '
09092015
This may look slightly more complex, but it is more efficient for a long list of directories, which is why I would care only if the list is long.
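The single-pass idea can also be sketched as a small function. This is only an illustration: `latest_mmddyyyy` is a hypothetical name, and it assumes the names are MMDDYYYY. It builds a YYYYMMDD key per name and keeps the best one seen so far, so there is only one comparison per item and no separate year and day bookkeeping.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the MMDDYYYY name with the latest date,
# scanning the list once instead of sorting it.
sub latest_mmddyyyy {
    my @names = @_;
    my ( $best, $best_key );
    for my $name (@names) {
        # Skip anything that is not exactly eight digits.
        my ( $md, $y ) = $name =~ /^(\d{4})(\d{4})$/ or next;
        my $key = "$y$md";    # YYYYMMDD compares correctly as a string
        if ( !defined $best_key or $key gt $best_key ) {
            ( $best, $best_key ) = ( $name, $key );
        }
    }
    return $best;    # undef if nothing matched
}

my @list = qw(12112014 01052015 02202015 01012011 09092015 09092013);
print latest_mmddyyyy(@list), "\n";
```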
|
|
c:\@Work\Perl\monks>perl -wMstrict -le
"use List::Util qw(maxstr);
;;
my @dates = qw(
12112014
12012014
01052015
12202014
12022014
02202015
03102015
01012011
09092015
04092015
);
;;
my $most_recent =
unpack 'x4a*',
maxstr
map pack('a4a*', unpack('x4a4', $_), $_),
@dates
;
;;
print $most_recent;
"
09092015
Use minstr for least-recent date. See the core module List::Util. No efficiency/performance testing done nor claims made. (Update: Actually, I'd be surprised if there's any advantage unless you're dealing with really large lists; default sort (with no subroutine block) is pretty fast!)
Update: See also pack, unpack, perlpacktut.
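For readers less used to pack/unpack, the same idea can be sketched with substr. This is an assumed equivalent, not the code above: each name gets its year as a 4-character prefix, maxstr or minstr picks the extreme, and the prefix is stripped again.

```perl
use strict;
use warnings;
use List::Util qw(minstr maxstr);

my @dates = qw(12112014 01052015 02202015 01012011 09092015 04092015);

# Prefix each name with its year so that string comparison looks at
# YYYY first, then MMDD; the original name follows the 4-char prefix.
my @keyed = map { substr( $_, 4, 4 ) . $_ } @dates;

# Strip the 4-character prefix again to recover the original name.
my $most_recent  = substr( maxstr(@keyed), 4 );
my $least_recent = substr( minstr(@keyed), 4 );

print "most recent:  $most_recent\n";
print "least recent: $least_recent\n";
```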
Give a man a fish: <%-{-{-{-<
|
|
Yes, it is a fairly nice way of doing it.++
I had also been thinking about some similar form of GR-like transform (though I had not thought about using the maxstr function of List::Util), but in the end I preferred to make my algorithmic point with a simple, straightforward manual search for the maximum date.
I also agree that the various ways of doing that have little consequence on performance unless the list is really very long.
Re: Sort directories by date
by karlgoethebier (Abbot) on Sep 29, 2015 at 18:25 UTC
As RichardK wrote above:
"...It would have been much easier if you'd created the directory names as 'YYYYMMDD' then they would have sorted naturally with little effort...."
But I guess that it is what it is: you can't change that any more.
Here is another idea for sorting these directories that doesn't require any dark Klingon maneuvers ;-)
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw (say);
my @dates = qw{
12112014
01052015
02202015
03102015
01012011
04092015
09092015
};
my %hash;
for (@dates) {
my ( $m, $d, $y ) = unpack q(a2a2a4);
$hash{qq($y$m$d)} = $_;
}
for ( sort { $a <=> $b } keys(%hash) ) {
say $hash{$_};
}
__END__
karls-mac-mini:monks karl$ ./dirs.pl
01012011
12112014
01052015
02202015
03102015
04092015
09092015
Please see also Path::Iterator::Rule, File::Basename, unpack, pack as well as perlpacktut.
Regards, Karl
«The Crux of the Biscuit is the Apostrophe»
|
|
Yeah, but why would you want to populate a hash and use sort, when you already have everything at hand and only need one very small extra step to find the max value?
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw (say);
my @dates = qw{
12112014
01052015
02202015
03102015
01012011
04092015
09092015
};
my $max_date = 0;
my $result;
for (@dates) {
my $date = join "", reverse unpack q(a4a4);
$result = $_ and $max_date = $date if $date > $max_date;
}
print $result;
Update: fixed a mistake (missing reverse) in the my $date = ... code line above. Thanks to poj for pointing out the error.
|
|
use strict;
use warnings;
use feature qw (say);
use List::Util qw(max);
use Time::Piece;
my @dates = qw{
12112014
01052015
02202015
03102015
01012011
04092015
09092015
};
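# strptime() parses the string as UTC, so convert the epoch back with gmtime();
# localtime() could shift the date by a day in time zones west of UTC.
say gmtime( max map { Time::Piece->strptime( $_, "%m%d%Y" )->epoch } @dates )
->strftime("%m%d%Y");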
__END__
\Desktop\monks>other_idea.pl
09092015
Best regards, Karl
«The Crux of the Biscuit is the Apostrophe»
Re: Sort directories by date
by Anonymous Monk on Sep 28, 2015 at 22:50 UTC
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1143273
use strict;
use warnings;
my @dirs = qw(
12112014
01052015
02202015
03102015
01012011
04092015
09092015
);
my $latest = (sort { $a % 1e4 <=> $b % 1e4 } sort @dirs)[-1];
print "latest dir $latest\n";
Because sort is stable :)
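If you would rather not depend on stability, the same ordering can be sketched as a single sort with a two-level comparison. This is an alternative, not the code above, and it assumes every name is exactly eight digits in MMDDYYYY form.

```perl
use strict;
use warnings;

my @dirs = qw(12112014 01052015 02202015 03102015 01012011 04092015 09092015);

# Compare the year (characters 4..7) first; on a tie fall back to the
# whole name, whose leading MMDD decides within the same year.
my @sorted = sort {
    substr( $a, 4 ) cmp substr( $b, 4 )
        or $a cmp $b
} @dirs;

print "latest dir $sorted[-1]\n";
```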
|
|
return [ glob "/my_dir/$latest/*.txt" ];
}
(untested)
|
|
sub direc
{
my $dir = "/my_dir";
my @dirs = map m[/(\d{8})\z], glob "$dir/*";
my $latest = (sort { $a % 1e4 <=> $b % 1e4 } sort @dirs)[-1];
[ glob "/my_dir/$latest/*.txt" ];
}
(untested)
Re: Sort directories by date
by locked_user sundialsvc4 (Abbot) on Sep 28, 2015 at 19:58 UTC
Or the short-answer, for those who don’t want to answer a quiz to get it, would be ... to use a sort-compare subroutine, inline or otherwise, along these lines: (untested)
sort {
substr($a, 4, 4) cmp substr($b, 4, 4)
||
substr($a, 2, 2) cmp substr($b, 2, 2)
||
substr($a, 0, 2) cmp substr($b, 0, 2)
} ...
The sort verb accepts as its first argument a function that, given two “magic variables” $a and $b, must return a value that is less than, equal to, or greater than zero. The <=> (numeric) and cmp (string) operators are specifically designed for this purpose. Here, in a simple in-line subroutine, we use the || logic-OR operator, which we know uses “short circuiting,” to return the first of three alternatives that is not zero. First, we compare the year. Then, the month, then the day. (The first position in a Perl string is position zero.)
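The short-circuit chaining described above can be sketched as a named comparator. This version assumes the names are MMDDYYYY (as the sample names elsewhere in this thread suggest), so the month is read from offset 0 and the day from offset 2; `by_mmddyyyy` is a hypothetical name.

```perl
use strict;
use warnings;

# Assumption: names are MMDDYYYY (month at 0-1, day at 2-3, year at 4-7).
# Each cmp yields -1, 0 or 1; || moves on to the next field only on a tie.
sub by_mmddyyyy {
    substr( $a, 4, 4 ) cmp substr( $b, 4, 4 )         # year
        || substr( $a, 0, 2 ) cmp substr( $b, 0, 2 )  # month
        || substr( $a, 2, 2 ) cmp substr( $b, 2, 2 ); # day
}

my @dirs   = qw(12112014 01052015 02202015 01012011 09092015);
my @sorted = sort by_mmddyyyy @dirs;

print "$_\n" for @sorted;
```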
|
|
Unfortunately, you have your month and day fields transposed: the third date in the list (02202015) shows that the format is MMDDYYYY, there not being a month 20 in the year!
If there are many directories to sort, there might be some benefit in using a more advanced sorting approach rather than repeatedly substr'ing the same fields as each date is compared with others in turn. I have used unpack as an alternative to substr in the following code. A Schwartzian transform:-
$ perl -Mstrict -Mwarnings -E '
my @dates = qw{
12112014
01052015
02202015
03102015
01012011
04092015
09092015
};
say for
map { $_->[ 0 ] }
sort { $a->[ 3 ] <=> $b->[ 3 ]
||
$a->[ 1 ] <=> $b->[ 1 ]
||
$a->[ 2 ] <=> $b->[ 2 ]
}
map { [ $_, unpack q{a2a2a4}, $_ ] }
@dates;'
01012011
12112014
01052015
02202015
03102015
04092015
09092015
$
Guttman Rosler transform:-
$ perl -Mstrict -Mwarnings -E '
my @dates = qw{
12112014
01052015
02202015
03102015
01012011
04092015
09092015
};
say for
map { substr $_, 8 }
sort
map { join q{}, ( unpack q{a2a2a4}, $_ )[ 2, 0, 1 ], $_ }
@dates;'
01012011
12112014
01052015
02202015
03102015
04092015
09092015
$
I hope this is of interest.