Re: Combining s///
by Corion (Patriarch) on Mar 09, 2008 at 17:21 UTC
When dealing with dates, I always convert them to YYYYMMDD, the only sane form.
for my $date (@dates) {
    my @parts = split qr!/!, $date;
    if ($parts[2] < 50) { # arbitrary cutoff
        $parts[2] += 2000
    } else {
        $parts[2] += 1900
    };
    $date = join "", @parts[2,0,1];
};
@sorted = sort @dates;
Conversion back to the other date format is left as an exercise to the reader.
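One possible take on that exercise, as a rough sketch only. The helper names to_yyyymmdd and to_mmddyy are made up for illustration, the cutoff of 50 is the same arbitrary one as in the loop above, and the input is assumed to be zero-padded MM/DD/YY.

```perl
use strict;
use warnings;

# Hypothetical helper: MM/DD/YY -> YYYYMMDD, same arbitrary cutoff as above.
sub to_yyyymmdd {
    my ($date) = @_;
    my ($m, $d, $y) = split m{/}, $date;
    $y += $y < 50 ? 2000 : 1900;
    return sprintf '%04d%02d%02d', $y, $m, $d;
}

# Hypothetical helper: YYYYMMDD -> MM/DD/YY (the century digits are dropped).
sub to_mmddyy {
    my ($stamp) = @_;
    my ($y, $m, $d) = $stamp =~ /^\d\d(\d\d)(\d\d)(\d\d)$/
        or die "not a YYYYMMDD date: $stamp";
    return "$m/$d/$y";
}

my @dates  = ('12/24/99', '09/06/04', '10/17/98');
my @sorted = map { to_mmddyy($_) } sort map { to_yyyymmdd($_) } @dates;
print "$_\n" for @sorted;
```

The plain string sort works here only because every key has the same width, which is the whole point of going through YYYYMMDD.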
Re: Combining s///
by igelkott (Priest) on Mar 09, 2008 at 19:43 UTC
Absolutely agree with Corion, YYYYMMDD is clearly the preferred intermediate format (if not final as well).
The only small thing to add is that you might consider one of the many Date modules (Date::Calc, Date::Manip, Time::Piece, etc.) if you want to do more than just sort them (e.g., find the difference between these dates).
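As a hedged illustration of the "difference between dates" case, here is a small sketch using Time::Piece. The helper name days_between is invented for this example, and the century chosen for a two-digit year is whatever the module's %y handling decides, not a cutoff you pick yourself.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: number of days from $from to $to, both in MM/DD/YY.
# Two-digit years are resolved by Time::Piece's %y rule (an assumption to
# check against the module's documentation before relying on it).
sub days_between {
    my ($from, $to) = @_;
    my $t1 = Time::Piece->strptime($from, '%m/%d/%y');
    my $t2 = Time::Piece->strptime($to,   '%m/%d/%y');
    return ($t2 - $t1)->days;    # subtracting two objects gives a Time::Seconds
}

print days_between('02/28/08', '03/01/08'), "\n";
```

The module takes care of the leap day in between, which is exactly the kind of detail that is easy to get wrong by hand.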
Re: Combining s///
by johngg (Canon) on Mar 09, 2008 at 23:50 UTC
Assuming you want to keep the dates in MM/DD/YY format, you could use a sort of Schwartzian Transform to do the sorting. The original date string is carried through the process unchanged and pulled out at the end (in the top map). The month, day and year are separated out in the first (bottom-most) map so that they are available to the sort. The middle map transforms the year so that (arbitrarily) any year >= 30 belongs to the 20th century and the rest belong to the 21st.
use strict;
use warnings;
my $cutoff = 30;
print
    map { $_->[0] }
    sort {
        $a->[3] cmp $b->[3]
        ||
        $a->[1] cmp $b->[1]
        ||
        $a->[2] cmp $b->[2]
    }
    map { $_->[3] += $_->[3] >= $cutoff ? 1900 : 2000; $_ }
    map { [ $_, m{(\d\d)/(\d\d)/(\d\d)} ] }
    <DATA>;
__END__
12/24/99
09/06/04
12/03/99
06/24/99
10/17/98
04/24/99
The output.
10/17/98
04/24/99
06/24/99
12/03/99
12/24/99
09/06/04
I hope this is of interest. Cheers, JohnGG
|
|
Since you're transforming anyway, might as well put it in a more easily sortable format!
map { $_->[0] }
sort { $a->[1] cmp $b->[1] }
map { my ($m, $d, $y) = m{(\d\d)/(\d\d)/(\d\d)};
      $y += $y >= $cutoff ? 1900 : 2000;
      [ $_, sprintf('%04d/%02d/%02d', $y, $m, $d) ]
}
Or for extra speed, avoid creating all those arrays and references.
map { substr($_, 10) }
sort
map { my ($m, $d, $y) = m{(\d\d)/(\d\d)/(\d\d)};
      $y += $y >= $cutoff ? 1900 : 2000;
      sprintf('%04d/%02d/%02d', $y, $m, $d) . $_
}
Update: Added missing $.
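For anyone who wants to run the second variant as-is, here is a minimal complete sketch. The wrapper name sort_mmddyy and the sample list are assumptions added for illustration; the body is the same packed-key idea (often called a Guttman-Rosler Transform): prepend a fixed-width sortable key, sort as plain strings, then cut the key off again.

```perl
use strict;
use warnings;

my $cutoff = 30;    # same arbitrary cutoff as in the parent node

# Hypothetical wrapper: takes MM/DD/YY strings, returns them in date order.
# Assumes every input matches the pattern; anything else would trigger
# "uninitialized" warnings.
sub sort_mmddyy {
    return
        map { substr($_, 10) }      # drop the 10-character YYYY/MM/DD key
        sort                        # default string sort on the key prefix
        map {
            my ($m, $d, $y) = m{(\d\d)/(\d\d)/(\d\d)};
            $y += $y >= $cutoff ? 1900 : 2000;
            sprintf('%04d/%02d/%02d', $y, $m, $d) . $_;
        } @_;
}

print "$_\n" for sort_mmddyy(qw( 12/24/99 09/06/04 12/03/99 06/24/99 10/17/98 04/24/99 ));
```

The offset 10 in substr is tied to the width of the sprintf format, so the two have to change together.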
Re: Combining s///
by starbolin (Hermit) on Mar 09, 2008 at 19:39 UTC
Interesting question. Took me a lot longer than I expected. Did you know you can have perl statements inside your regex?
#!/usr/bin/perl
for (@ARGV) {
    my $d;
    s! (\d{2})/
       (\d{2})/
       (\d{2})
       (?{ $d = ($3 gt 55)?'19'.$3:'20'.$3;})
    !$d/$1/$2!ox;
    print; print "\n";
}
% ./foo 01/02/03 02/14/08 01/02/89
2003/01/02
2008/02/14
1989/01/02
s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s
|-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,,
$|=1,select$,,$,,$,,1e-1;print;redo}
|
|
More importantly, did you know you can have the substitute expression treated as Perl code using the e modifier?
for (@ARGV) {
    s! (\d{2})/
       (\d{2})/
       (\d{2})
    !
       my $y = ( $3 > 55 ? '19' : '20' ) . $3;
       "$y/$1/$2"
    !xeg;
    print "$_\n";
}
More tips:
- gt is to compare strings. > is to compare numbers.
- Don't use the o modifier. It only adds disadvantages (no advantages) nowadays.
- Use local our $var; instead of my $var; for variables that are declared outside the regexp but used inside (?{}) and (??{}). It doesn't always make a difference, but it'll save you from subtle errors.
- Name a variable holding a year something other than $d.
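A tiny sketch of the first tip, with made-up values. In the regex above \d{2} guarantees two digits, so gt happens to give the same answer as > there; the difference shows up as soon as the operands have different lengths.

```perl
use strict;
use warnings;

# String comparison goes character by character, so '9' sorts after '55'.
my $as_string = ( '9' gt '55' ) ? 1 : 0;
# Numeric comparison uses the values, so 9 is not greater than 55.
my $as_number = ( '9' >  '55' ) ? 1 : 0;

print "gt: $as_string, >: $as_number\n";
```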
|
|
I didn't know about the 'e' modifier. That would have saved me a lot of time. Thanks ikegami.
PS: $d stood in for 'date' and was going to be made more verbose later, but later, as is its wont, never showed up.
Re: Combining s///
by locked_user sundialsvc4 (Abbot) on Mar 10, 2008 at 14:10 UTC
My take on this matter is somewhat ruled by the philosophy “Dictum Ne Agas: Do Not Do a Thing Already Done.”
“Date manipulation,” of any sort, definitely qualifies as “a thing already done.” Therefore, instead of writing Yet Another One, surf through CPAN and look for any one of a dozen good, highly-rated date manipulation packages. They will imbue your code not only with the ability to do date-comparison (for use with Perl's existing sort-capability), but also the ability to support any date-format. The heavy-lifting has been pushed out of our code and into the purpose-built CPAN routines. Needless to say, they solve the “Y2K problem” in industry-standard accepted ways. It all just works.
Honestly, with so many thousands of great routines out there, your first step in handling almost any problem is to figure out which ready-made solution you can pick off the shelf, whack-on just a few times to get it to do exactly what you want, then just drop-in and use... The fastest way to do any job is to not have to do it at all.
|
|
Globally, I agree with sundialsvc4 and would strongly encourage following the good monk's advice. However, it's hard to beat the simplicity and compactness of starbolin's and ikegami's approaches. They save loading the CPAN modules and, for those like myself who can't use CPAN modules on our control systems, they are sometimes the only way (don't get me started on why we can't use CPAN modules... it's a sore point with me that our IT-Nazis impose such restrictions). Plus, the provided solutions teach a lot about the oft-misunderstood regexes and the regex engine. I really liked starbolin's and ikegami's compact solutions. I had tried to do the same thing, but couldn't figure out how to get the Perl code into the substitution portion of the s/// regex. Thanks, starbolin and ikegami.