sidsinha has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

How could I do a multi-level sort, i.e. first sort by date, then by time, and then print only the entries for today's date from the file?

Below is what the file looks like:
Date Time Tag
08-13-2013 19:22:16 Yes
08-13-2013 18:22:17 No
08-13-2013 21:22:17 Yes
08-13-2013 20:22:16 Yes
08-13-2013 23:22:18 Yes
08-13-2013 22:22:17 No
08-14-2013 01:22:17 Yes
08-14-2013 00:22:18 Yes
08-14-2013 03:22:19 No
08-14-2013 02:22:18 Yes
08-14-2013 05:22:28 No
08-14-2013 04:22:29 Yes
08-14-2013 07:22:19 Yes
08-14-2013 06:22:18 Yes
08-14-2013 09:22:19 No
08-14-2013 08:22:19 Yes
08-14-2013 11:22:19 Yes
08-14-2013 10:22:20 No
08-14-2013 13:22:20 Yes
08-14-2013 12:22:20 No
08-14-2013 15:22:21 Yes
08-14-2013 14:22:20 Yes
08-14-2013 17:22:21 Yes
08-14-2013 16:22:22 No
The output needs to be sorted by date first and then by time. Thanks

Replies are listed 'Best First'.
Re: Date plus Time sort from file
by tobyink (Canon) on Aug 15, 2013 at 07:33 UTC

    Just as a general point, rather than "sort then filter", it's generally a better idea to "filter then sort". This is because sorting is a comparatively slow operation; filtering first reduces the size of the list that needs to be sorted, making the sorting faster.

    You won't notice any difference on a list with 25 items, but if you've got thousands, it can make a significant difference.
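
To make the point above concrete, here is a minimal sketch on a handful of made-up records in the thread's layout. The record list and the hard-coded `$today` are assumptions for illustration only. Both orderings give the same result; the second one simply hands fewer items to `sort`.

```perl
use strict;
use warnings;

# Hypothetical sample records in the "MM-DD-YYYY HH:MM:SS Tag" layout.
my @records = (
    '08-14-2013 01:22:17 Yes',
    '08-13-2013 19:22:16 Yes',
    '08-14-2013 00:22:18 Yes',
    '08-13-2013 18:22:17 No',
);
my $today = '08-14-2013';    # hard-coded for the illustration

# Sort then filter: every record takes part in the sort.
my @sort_first = grep { index($_, "$today ") == 0 } sort @records;

# Filter then sort: only today's records are sorted.
my @filter_first = sort grep { index($_, "$today ") == 0 } @records;

print "$_\n" for @filter_first;
```

On four records the difference is invisible; it only starts to matter as the unfiltered list grows.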

    package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name

      Yes, I fully agree with tobyink: filtering first and only then sorting can generally improve performance dramatically, because the program will have much less work to do (at least if you have many records). In this specific case, filtering on today's date offers another major advantage: since all the dates will be the same, you really only have to sort on time, and this is much simpler and easier than sorting dates in your given format.

      If you had to sort on dates and then times, you would need to split the records in order to sort first the years, then the months, then the days, and then the time (or possibly apply some other type of transformation to your data). Here, since all the dates will have been filtered and will therefore be the same, a simple sort on the full record will produce the desired result.
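
For the general case described above (records spanning several dates), one possible transformation is to build a `YYYYMMDD HH:MM:SS` key for each record and sort on that. This is only a sketch: `sort_key` is a hypothetical helper, not something posted in this thread, and the sample records are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "MM-DD-YYYY HH:MM:SS ..." into
# "YYYYMMDD HH:MM:SS", a key that sorts correctly as a plain string.
sub sort_key {
    my ($record) = @_;
    my ($mm, $dd, $yyyy, $time) =
        $record =~ /^(\d\d)-(\d\d)-(\d{4})\s+(\d\d:\d\d:\d\d)/
        or return '';    # unparsable lines get an empty key and sort first
    return "$yyyy$mm$dd $time";
}

my @records = (
    '08-14-2013 01:22:17 Yes',
    '12-31-2012 23:59:59 No',
    '08-13-2013 19:22:16 Yes',
    '08-14-2013 00:22:18 Yes',
);

# Schwartzian transform: compute each key once, sort on it, drop it again.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map  { [ sort_key($_), $_ ] }
    @records;

print "$_\n" for @sorted;
```

Computing the key once per record keeps the regex out of the comparison function, which is called many times during a sort.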

      So you basically need something as simple as this:

      my @sorted_records = sort grep {/^08-14-2013/} @unsorted_records;

      Of course, in real life, you probably don't want to hardcode the date (08/14/2013) in your filtering. If you want to find today's date in your specific date format, you can do something like this:

      my ($day, $month, $year) = (localtime time)[3..5];
      $year += 1900;
      $day   = sprintf "%02d", $day;
      $month = sprintf "%02d", $month + 1;
      my $date = "$month-$day-$year";

      Or you could use one of the numerous date modules available on the CPAN. Or the strftime POSIX function illustrated elsewhere in this thread by boftx.
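
As a small sketch of the two approaches mentioned here, the following builds today's date in the `MM-DD-YYYY` format both by hand and with `POSIX::strftime`. The variable names are illustrative; the time is captured once so that both forms are built from the same instant.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

my @now = localtime(time);    # capture the time once so both forms agree

# Manual construction from the localtime fields (day, month, year).
my ($day, $month, $year) = @now[3 .. 5];
my $manual = sprintf "%02d-%02d-%04d", $month + 1, $day, $year + 1900;

# The same string from POSIX::strftime.
my $posix = strftime("%m-%d-%Y", @now);

print "$manual\n$posix\n";
```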

        Based on what was just said, my example can be simplified even further:

        # this is what should be used instead of the hard-coded value:
        # my $today = strftime("%m-%d-%Y",localtime());
        my $today = '08-14-2013';
        my @todays;
        for ( @orig_data ) {
            next unless m/^($today )/;
            push(@todays,$_);
        }

        # NOTE! this will reverse the order of Yes and No if they have the
        # same time.
        @todays = sort(@todays);
        for ( @todays ) {
            print "$_\n";
        }

        I agree that filtering then sorting will result in (sometimes large) performance gains. In this particular case, where the unsorted array is actually records coming from a file, you only want to store the records of interest in memory instead of slurping in the entire file. RAM might be cheap, but it is a finite resource and SAs can be a nasty breed of cat. (Remember the BOFH?)
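
A minimal sketch of that idea, reading line by line and keeping only today's records in memory. To stay self-contained it reads from an in-memory filehandle over a made-up string; with a real file you would open it by name instead. The hard-coded `$today` is again an assumption for illustration.

```perl
use strict;
use warnings;

# Stand-in for the real data file: an in-memory filehandle over a string.
my $data = <<'END';
Date Time Tag
08-13-2013 19:22:16 Yes
08-14-2013 01:22:17 Yes
08-13-2013 18:22:17 No
08-14-2013 00:22:18 Yes
END

my $today = '08-14-2013';    # hard-coded for the illustration

open my $in, '<', \$data or die "Cannot open data: $!";
my $header = <$in>;          # skip the "Date Time Tag" header line
my @todays;
while ( my $line = <$in> ) {
    chomp $line;
    # Keep only today's records; everything else is discarded immediately.
    push @todays, $line if index($line, "$today ") == 0;
}
close $in;

print "$_\n" for sort @todays;
```

Only the matching lines are ever stored, so memory use depends on the number of records for today rather than on the size of the file.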

Re: Date plus Time sort from file
by Anonymous Monk on Aug 15, 2013 at 01:57 UTC
      #!/usr/bin/perl --
      use strict;
      use warnings;
      use Time::Piece;

      Main( @ARGV );
      exit( 0 );

      sub Main {
          my( $now ) = @_;
          ## $now ||= Time::Piece::localtime->strftime('%m-%d-%Y');
          $now ||= "08-13-2013";
          my $raw = 'Date Time Tag
      08-13-2013 21:22:17 Yes
      08-13-2013 22:22:17 No
      08-14-2013 11:22:17 Yes
      08-13-2012 21:22:17 Yes
      08-13-2011 22:22:17 No
      08-14-2012 01:22:17 Yes
      ';
          my( $header, $todaytes ) = rubberBiscuit( $now , \$raw );
          print $header;
          print "$_\n" for @$todaytes ;
      }

      sub rubberBiscuit {
          my( $now, $file ) = @_;
          use autodie;
          open my($in), '<', $file ; ## or die by autodie
          my @today;
          my $header = readline $in;
          while( readline $in ){
              my($date, $time, $tag ) = split ' ';
              if( $date eq $now ){
                  push @today, join ' ',
                      Time::Piece->strptime(
                          "$date $time",
                          '%m-%d-%Y %H:%M:%S',
                      )->strftime('%Y-%m-%d %H:%M:%S'),
                      $tag;
              }
          }
          @today = sort @today;
          return $header, \@today;
      }
      __END__
      Date Time Tag
      2013-08-13 21:22:17 Yes
      2013-08-13 22:22:17 No

        I think this can be simplified somewhat. The following just demonstrates the basic selection/sort logic without regard to how you want to handle the file itself.

        #!/usr/bin/perl
        use strict;
        use POSIX qw(strftime);

        my @orig_data = (
            'Date Time Tag',
            '08-13-2013 19:22:16 Yes',
            '08-13-2013 18:22:17 No',
            '08-13-2013 21:22:17 Yes',
            '08-13-2013 20:22:16 Yes',
            '08-13-2013 23:22:18 Yes',
            '08-13-2013 22:22:17 No',
            '08-14-2013 01:22:17 Yes',
            '08-14-2013 00:22:18 Yes',
            '08-14-2013 03:22:19 No',
            '08-14-2013 02:22:18 Yes',
            '08-14-2013 05:22:28 No',
            '08-14-2013 04:22:29 Yes',
            '08-14-2013 07:22:19 Yes',
            '08-14-2013 06:22:18 Yes',
            '08-14-2013 09:22:19 No',
            '08-14-2013 08:22:19 Yes',
            '08-14-2013 11:22:19 Yes',
            '08-14-2013 10:22:20 No',
            '08-14-2013 13:22:20 Yes',
            '08-14-2013 12:22:20 No',
            '08-14-2013 15:22:21 Yes',
            '08-14-2013 14:22:20 Yes',
            '08-14-2013 17:22:21 Yes',
            '08-14-2013 16:22:22 No',
        );

        # this should be used instead of the hard-coded value below:
        # my $today = strftime("%m-%d-%Y",localtime());
        my $today = '08-14-2013';
        my @todays;
        for ( @orig_data ) {
            my ($date,$other) = split(/\s/,$_,2);
            next unless $date eq $today;
            push(@todays, "$other");
        }

        # NOTE! this will reverse the order of Yes and No if they have the same time.
        @todays = sort(@todays);
        for ( @todays ) {
            print "$today $_\n";
        }
        __END__
Re: Date plus Time sort from file
by soonix (Chancellor) on Aug 15, 2013 at 09:17 UTC
    In addition to what tobyink and Laurent_R just said: in your case, when you are sorting data from just one day (or even one month), you can sort by the default (complete record) without splitting or transforming. Of course (depending on how long your records are), sort might lose, performance-wise, what you gain by omitting those steps, but even then:
    • if you need to run this just once, run time does not matter as much as programming time
    • if specifications/requirements change relatively often (or are not yet stable), it is more important to have code that is easy to understand than code that is efficient
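
A short sketch of where the default whole-record sort is safe and where it is not, using made-up records in the thread's `MM-DD-YYYY` layout. Within a single month the leading fields are constant, so the day and time decide the order. Across years the month field is compared first, and the result is no longer chronological.

```perl
use strict;
use warnings;

# Hypothetical records: one set within a single month, one spanning two years.
my @one_month = ('08-14-2013 00:22:18 Yes', '08-13-2013 23:22:18 Yes');
my @two_years = ('01-05-2013 10:00:00 Yes', '12-31-2012 23:59:59 No');

# Chronological: only the day and the time differ.
my @sorted_month = sort @one_month;

# NOT chronological: "01-05-2013" sorts before "12-31-2012" as a string.
my @sorted_years = sort @two_years;

print "$_\n" for @sorted_month, @sorted_years;
```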
Re: Date plus Time sort from file
by boftx (Deacon) on Aug 15, 2013 at 09:58 UTC

    Of course, what is really being done is this:

    $ grep `date +%m-%d-%Y` datafile.txt | sort >data_`date +%m-%d-%Y`.txt

    But who wants to use a scripting language? :)