Re^2: How to get minimum start date in these start dates ?

by JediWizard (Deacon)
on Jul 10, 2006 at 14:17 UTC


in reply to Re: How to get minimum start date in these start dates ?
in thread How to get minimum start date in these start dates ?

cog, while I agree that a Schwartzian transform is a good idea here, I would like to make two small comments.

1. Rather than doing three comparisons, first on year, then month, then day, I believe (though I haven't benchmarked it) that it would be faster to do a single comparison of a string in the form yyyymmdd, which can easily be created with a regex.

2. Depending on your data set, it may be considerably faster to use an Orcish Maneuver, especially if the same date may appear multiple times in the list (and I didn't necessarily see anything in the post to indicate that that wouldn't happen).

my %date_hash;
my @start_date = qw(01-06-2007 01-08-2006 01-06-2006 01-07-2007 06-01-2007);

# Orcish Maneuver: cache each date's sort key the first time it is needed.
my @sorted_dates = sort {
    ($date_hash{$a} ||= trans_date($a)) <=> ($date_hash{$b} ||= trans_date($b))
} @start_date;

print join("\n", @sorted_dates);

# Rewrite dd-mm-yyyy as yyyymmdd so dates compare as plain numbers.
sub trans_date {
    my $date = shift;
    $date =~ s/(\d{2})-(\d{2})-(\d{4})/$3$2$1/;
    return $date;
}
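For comparison, point 1 on its own might look something like the sketch below: a Schwartzian transform that carries a single yyyymmdd string as its key instead of separate year, month and day fields. The helper name yyyymmdd is made up for illustration, and the sketch assumes every date is a well-formed, zero-padded dd-mm-yyyy string.

```perl
use strict;
use warnings;

# Sample dates in dd-mm-yyyy form, as in the code above.
my @start_date = qw(01-06-2007 01-08-2006 01-06-2006 01-07-2007 06-01-2007);

# Schwartzian transform with one yyyymmdd key per date.
my @sorted_dates =
    map  { $_->[1] }                  # 3. recover the original string
    sort { $a->[0] cmp $b->[0] }      # 2. one string comparison per pair
    map  { [ yyyymmdd($_), $_ ] }     # 1. compute the key once per date
    @start_date;

print "Earliest: $sorted_dates[0]\n";

# Hypothetical helper (not from the thread): dd-mm-yyyy to yyyymmdd.
# Assumes zero-padded fields; dies on anything else.
sub yyyymmdd {
    my ($date) = @_;
    my ($day, $month, $year) = $date =~ /^(\d{2})-(\d{2})-(\d{4})$/
        or die "Unexpected date format: $date";
    return "$year$month$day";
}
```

Because the key is fixed-width, a string comparison (cmp) and a numeric one (<=>) give the same order here.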

They say that time changes things, but you actually have to change them yourself.

—Andy Warhol

Re^3: How to get minimum start date in these start dates ?
by cog (Parson) on Jul 10, 2006 at 15:10 UTC
    While I don't disagree with you, I find the Schwartzian transform easier to understand and memorize than an Orcish Maneuver, from the viewpoint of a newbie.

    Also, the benchmarking would depend largely on the data set (suppose all the years are different, for instance).

    Still, I'm inclined to believe that speed won't be relevant, in this case :-) Just a hunch, you know? :-)
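To make the point about the data set a little more concrete, here is a rough sketch that counts how often the sort key is actually computed when an Orcish Maneuver cache is in place. The data (three distinct dates repeated 300 times), the counter and the names $make_key and %key_for are all invented for illustration; none of them come from the thread.

```perl
use strict;
use warnings;

# Hypothetical data: 900 entries, but only three distinct dates.
my @dates = (qw(01-06-2007 01-08-2006 01-06-2006)) x 300;

my $calls = 0;    # number of times a key is really computed
my %key_for;      # the Orcish Maneuver cache

# Key builder: dd-mm-yyyy to yyyymmdd, counting each invocation.
my $make_key = sub {
    my ($date) = @_;
    $calls++;
    ( my $key = $date ) =~ s/^(\d{2})-(\d{2})-(\d{4})$/$3$2$1/;
    return $key;
};

# "||=" only calls $make_key when the date has no cached key yet.
my @sorted = sort {
    ( $key_for{$a} ||= $make_key->($a) ) <=> ( $key_for{$b} ||= $make_key->($b) )
} @dates;

print "entries: ", scalar(@dates), ", key computations: $calls\n";
```

With this input the counter should end at 3, one per distinct date. If every date in the list were different, the cache would save nothing over computing each key once up front, which is what a Schwartzian transform does anyway.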

      I agree with you, cog, regarding the ST over the OM, but that's probably because I've never used the OM in anger, so I'm not familiar with it. I think that both JediWizard's solution and yours overcomplicate the transformation of the date into a sortable form. Just reversing the date to sort it and then reversing it again to extract it seems much simpler and quicker to me. I have done some benchmarking which seems to bear this out. I've also corrected a couple of typos (you had missed a closing quote in one of your hash keys, but I've unquoted them all, and JediWizard had doubled his quote words, like qw(qw( ... ))). Here is the code:

      use strict;
      use warnings;

      use Benchmark qw(cmpthese);

      # Generate a thousand dates at random.
      #
      my @startDates;
      push @startDates,
          sprintf(q{%02d}, int((rand 28) + 1)) .
          q{-} .
          sprintf(q{%02d}, int((rand 12) + 1)) .
          q{-} .
          int((rand 25) + 2000)
          for (1 .. 1000);

      # cog's method.
      #
      my $rcCog = sub
      {
          my @sortedDates =
              map { $_->{date} }
              sort {
                  $a->{year}  <=> $b->{year}  or
                  $a->{month} <=> $b->{month} or
                  $a->{day}   <=> $b->{day}
              }
              map {
                  /(\d\d)-(\d\d)-(\d\d\d\d)/;
                  { date => $_, day => $1, month => $2, year => $3 }
              }
              @startDates;
          return $sortedDates[0];
      };

      # JediWizard's method.
      #
      my $rcJediWizard = sub
      {
          my %dateHash = ();
          my @sortedDates =
              sort {
                  ($dateHash{$a} ||= transDate($a)) <=>
                  ($dateHash{$b} ||= transDate($b))
              }
              @startDates;
          return $sortedDates[0];
      };

      # johngg's method.
      #
      my $rcJohnGG = sub
      {
          return (
              map {join q{-}, reverse split /-/}
              sort
              map {join q{-}, reverse split /-/}
              @startDates
          )[0];
      };

      # Run all three on data to prove they come up with
      # the same answer.
      #
      print q{$rcCog->() - }, $rcCog->(), qq{\n};
      print q{$rcJediWizard->() - }, $rcJediWizard->(), qq{\n};
      print q{$rcJohnGG->() - }, $rcJohnGG->(), qq{\n};

      # Run the benchmark
      #
      cmpthese (50, {
          Cog        => $rcCog,
          JediWizard => $rcJediWizard,
          JohnGG     => $rcJohnGG
      });

      # JediWizard's date translation routine.
      #
      sub transDate
      {
          my $date = shift;
          $date =~ s/(\d{2})-(\d{2})-(\d{4})/$3$2$1/;
          return $date;
      }

      And these are the results

      $rcCog->() - 13-01-2000
      $rcJediWizard->() - 13-01-2000
      $rcJohnGG->() - 13-01-2000
                   Rate JediWizard        Cog     JohnGG
      JediWizard 6.00/s         --        -0%       -61%
      Cog        6.00/s         0%         --       -61%
      JohnGG     15.2/s       153%       153%         --

      Looks like your hunch about speed was correct in that you and JediWizard pan out about the same (seems to go either way over several runs but the one I captured here was a dead heat). However, my simpler solution appears to be consistently quicker.
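For anyone who finds the reverse-and-sort one-liner a bit dense, the three stages can be pulled apart as in the sketch below. It uses the five sample dates from earlier in the thread rather than the random benchmark data, and the intermediate array names are only illustrative. It relies on every field being zero-padded to a fixed width, so that string order and date order agree.

```perl
use strict;
use warnings;

# Sample dates from earlier in the thread (dd-mm-yyyy).
my @start_dates = qw(01-06-2007 01-08-2006 01-06-2006 01-07-2007 06-01-2007);

# Step 1: flip each date to yyyy-mm-dd so plain string order is date order.
my @flipped = map { join q{-}, reverse split /-/ } @start_dates;

# Step 2: the default sort compares strings; no comparison block is needed.
my @ordered = sort @flipped;

# Step 3: flip each date back to dd-mm-yyyy.
my @sorted_dates = map { join q{-}, reverse split /-/ } @ordered;

print "$_\n" for @sorted_dates;
```

The absence of a comparison block in step 2 is a plausible reason for the speed difference, though that is a guess rather than something the benchmark itself demonstrates.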

      I hope this is of interest.

      Cheers,

      JohnGG

        "I've never used the OM in anger"

        Neither have I. I'm actually a bit curious about what one might look like if written "in anger".

        ;-)


        They say that time changes things, but you actually have to change them yourself.

        —Andy Warhol
