Re: Sort text string by the date embedded

Your post would be much more readable if you enclosed the code within <code>...</code> tags.

The date format that you have is close to being sortable with a plain string comparison, because the years have 4 digits and the months and days are always 2 digits (with leading zeroes). The only problem is the field order: MM/DD/YYYY puts the most significant field, the year, last.

So in the sort, just reorder the fields into year, month, day and compare the results with a single cmp.

#!/usr/bin/perl -w
use strict;

my @strings = (
   "PROCESS_DT IN '01/01/2009'", 
   "PROCESS_DT IN '05/23/2006'", 
   "PROCESS_DT IN '01/01/2011'", 
   "PROCESS_DT IN '04/19/2009'", 
   "PROCESS_DT IN '07/01/2009'", );
   
   
@strings = sort {
    my ($monthA, $dayA, $yearA) = $a =~ m|(\d+)/(\d+)/(\d+)|;
    my ($monthB, $dayB, $yearB) = $b =~ m|(\d+)/(\d+)/(\d+)|;
    "$yearA$monthA$dayA" cmp "$yearB$monthB$dayB"
} @strings;

print join("\n",@strings),"\n";

__END__
PROCESS_DT IN '05/23/2006'
PROCESS_DT IN '01/01/2009'
PROCESS_DT IN '04/19/2009'
PROCESS_DT IN '07/01/2009'
PROCESS_DT IN '01/01/2011'

Replies are listed 'Best First'.
Re^2: Sort text string by the date embedded by and_noel (Initiate) on Oct 17, 2011 at 17:19 UTC
This worked perfectly and was simple. Thanks for the fast response.
Re^2: Sort text string by the date embedded by AR (Friar) on Oct 17, 2011 at 16:53 UTC
This would be a great occasion to use the Schwartzian Transform.
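
For readers who have not met it, here is a minimal sketch of what a Schwartzian Transform could look like for the strings in the parent post. It is only an illustration, not code from the thread: the variable name @sorted is made up here, and the regex is copied from the parent's sort block. Each string is wrapped once together with a precomputed YYYYMMDD key, the wrappers are sorted on that key, and the original strings are unwrapped at the end.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @strings = (
    "PROCESS_DT IN '01/01/2009'",
    "PROCESS_DT IN '05/23/2006'",
    "PROCESS_DT IN '01/01/2011'",
    "PROCESS_DT IN '04/19/2009'",
    "PROCESS_DT IN '07/01/2009'",
);

# Schwartzian Transform: read the chain from the bottom up.
# @sorted is a hypothetical name chosen for this sketch.
my @sorted =
    map  { $_->[1] }                 # 3. unwrap: keep only the original string
    sort { $a->[0] cmp $b->[0] }     # 2. compare the precomputed keys
    map  {                           # 1. wrap: [ "YYYYMMDD", original string ]
        my ($month, $day, $year) = m|(\d+)/(\d+)/(\d+)|;
        [ "$year$month$day", $_ ]
    } @strings;

print "$_\n" for @sorted;
```

The regex runs once per string here, instead of twice per comparison as in the parent's sort block. The price is one small anonymous array per element.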
Re^3: Sort text string by the date embedded by Marshall (Canon) on Oct 18, 2011 at 00:14 UTC
Well, that depends upon a number of factors. The ST pre-calculates what I did with the regex match and saves it for later use. This requires more memory copies and allocation, and then a transformation back into the original array.

Over the years, I've done benchmarks with the ST and without. What I have found is that it is not as important as it used to be. Perl's sort algorithm has gone through improvements, and in particular the worst case performance has improved dramatically due to the move from quick sort to merge sort. I think that there have also been other improvements "under the hood" that have made sort way faster than it once was.

In a case like this, I would not at first blush worry about performance with N=100 or even N=1,000. With N=10,000 I would think about it. So with an array of 80,000 things, an ST is clearly going to be worthwhile if performance matters. With 1,000 it is often about a toss up. I would say that the vast, vast majority of sorts that I do in Perl are on fewer than 100 things. I start thinking about performance considerations at about 1,000. For what it is worth, those are my "rules of thumb".

Part of this has to do with how expensive it is to extract the relevant comparison data from $a and $b. In the OP's question, I didn't have to call any fancy date/time modules; a very simple single regex got the job done. That matters. I don't have to save the result of a computation for later use if that computation wasn't expensive to begin with. And again, if I save that result, I have to make a new data structure, sort that, and then reconstruct the original thing.

For 100 things or fewer (most sorts), I don't see it. For sorting 1,000 things, an ST is worth thinking about. For sorting 10,000+ things, you probably should be doing it.
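
If you want to see where the break-even point falls on your own machine, a sketch along the following lines could be used. Everything in it is an assumption made for illustration: the random test data, the value of $n, and the helper names key_of, plain_sort and st_sort are not from the thread. Benchmark is a core module, and cmpthese with a negative count runs each candidate for at least that many CPU seconds.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: $n random dates in the OP's format.
my $n = 10_000;
my @strings = map {
    sprintf "PROCESS_DT IN '%02d/%02d/%04d'",
        1 + int(rand(12)), 1 + int(rand(28)), 2000 + int(rand(20));
} 1 .. $n;

# Build the sortable YYYYMMDD key from one string.
sub key_of {
    my ($month, $day, $year) = $_[0] =~ m|(\d+)/(\d+)/(\d+)|;
    return "$year$month$day";
}

# Plain sort: the key is recomputed for both sides of every comparison.
sub plain_sort {
    my @out = sort { key_of($a) cmp key_of($b) } @strings;
    return \@out;
}

# Schwartzian Transform: the key is computed once per element.
sub st_sort {
    my @out =
        map  { $_->[1] }
        sort { $a->[0] cmp $b->[0] }
        map  { [ key_of($_), $_ ] } @strings;
    return \@out;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    plain => \&plain_sort,
    st    => \&st_sort,
});
```

Change $n to 100, 1,000 and so on to test the rules of thumb above against your own Perl version and data. The numbers will vary from machine to machine.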