This is getting fairly deep into date parsing, and there are probably modules better suited to the job (perhaps Date::Parse), but here's an example of a structured approach using named capture groups, a regex feature available from Perl 5.10 on (caution: this code has not been exhaustively tested):
>perl -wMstrict -le
"my @dates = (
   'Mar 11 08:02:08',         '11 Mar 08:02:08',
   'Mar 11 08:02:08.32',      '11 Mar 08:02:08.32',
   'Mar 11 2011 08:02:08',    '2011 Nov 11 08:02:08',
   'Mar 11 2011 08:02:08.32', '11 Dec 2011 08:02:08',
   '--------------------',
   'Mar 32 2011 08:02:08',
   '2011 08:02:08',
   'Mar 11 2011 .00',
   );
 ;;
 my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
 my $mname = qr{ (?<MON> @{[ join '|', @months ]}) }xms;
 my $mday  = qr{ (?<DAY> 0 [1-9] | [12] \d | 3 [01]) }xms;
 my $day   = qr{ $mname \s+ $mday | $mday \s+ $mname }xms;
 my $year  = qr{ (?<YEAR> (?: 19 | 20) \d\d) }xms;
 my $date  = qr{ $day (?: \s+ $year)? | (?: $year \s+)? $day }xms;
 my $hms   = qr{ (?<HMS> \d\d (?: : \d\d){2}) }xms;
 my $hund  = qr{ (?<HUND> \. \d{2}) }xms;
 my $time  = qr{ $hms $hund? }xms;
 ;;
 DATE:
 for my $d (@dates) {
   my $parsed = $d =~ m{ \A $date \s+ $time \z }xms;
   if (not $parsed) {
     warn qq{bad date: '$d'};
     next DATE;
     }
   my $day = $+{DAY};
   my $mon = $+{MON};
   my $yr  = $+{YEAR} || '1999';
   my $t   = $+{HMS};
   my $h   = $+{HUND} || '.00';
   my $canonical_date = qq{$mon $day $yr $t$h};
   printf qq{%-25s -> '%s' \n}, qq{'$d'}, $canonical_date;
   }
"
'Mar 11 08:02:08'         -> 'Mar 11 1999 08:02:08.00'
'11 Mar 08:02:08'         -> 'Mar 11 1999 08:02:08.00'
'Mar 11 08:02:08.32'      -> 'Mar 11 1999 08:02:08.32'
'11 Mar 08:02:08.32'      -> 'Mar 11 1999 08:02:08.32'
'Mar 11 2011 08:02:08'    -> 'Mar 11 2011 08:02:08.00'
'2011 Nov 11 08:02:08'    -> 'Nov 11 2011 08:02:08.00'
'Mar 11 2011 08:02:08.32' -> 'Mar 11 2011 08:02:08.32'
'11 Dec 2011 08:02:08'    -> 'Dec 11 2011 08:02:08.00'
bad date: '--------------------' at -e line 1.
bad date: 'Mar 32 2011 08:02:08' at -e line 1.
bad date: '2011 08:02:08' at -e line 1.
bad date: 'Mar 11 2011 .00' at -e line 1.
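The part that lets both orderings share one set of variables is that MON and DAY each appear in more than one branch of the alternation. When several groups have the same name, %+ returns the leftmost one that actually took part in the match. Here is a stripped-down sketch of just that behavior, written as a script rather than a one-liner; the short month list, the $either pattern, and the sample strings are mine and only for illustration:

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need Perl 5.10 or later

# Illustrative, shortened building blocks (not the full patterns above).
my $mon = qr{ (?<MON> Jan | Feb | Mar ) }xms;
my $day = qr{ (?<DAY> \d\d ) }xms;

# Both branches reuse the same group names.
my $either = qr{ $mon \s+ $day | $day \s+ $mon }xms;

my @results;
for my $s ('Mar 11', '11 Mar') {
    if ($s =~ m{ \A $either \z }xms) {
        # %+ yields whichever same-named group matched, in either branch
        push @results, join ' ', $+{MON}, $+{DAY};
    }
}
print "$_\n" for @results;
```

Both input strings should come out in the same month-then-day order, which is why the loop above never needs to know which layout it was given.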
Update: The central parsing regex above was originally
m{ \A $date \s+ $time $hund? \z }xms
but should have been and is now
m{ \A $date \s+ $time \z }xms
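The extra $hund? was redundant because $time already ends with an optional $hund, and this fix produces no change in the output shown above. (Strictly speaking, the original pattern was also slightly too permissive: it would have accepted a doubled fraction such as 08:02:08.32.45.)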
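To compare the two versions of the central regex directly, here is a small sketch that runs both against the same strings. The $hms, $hund and $time definitions are copied from the code above; $date is replaced by a much-simplified stand-in to keep it short, and the names $old, $new and %verdict are mine. Because $time already carries an optional $hund, the two patterns agree on ordinary input and can only differ when a second fractional part follows the first:

```perl
use strict;
use warnings;
use 5.010;

# Simplified stand-in for the full $date pattern (assumption, for brevity).
my $date = qr{ [A-Z][a-z]{2} \s+ \d\d }xms;

# Copied from the one-liner above.
my $hms  = qr{ (?<HMS> \d\d (?: : \d\d){2}) }xms;
my $hund = qr{ (?<HUND> \. \d{2}) }xms;
my $time = qr{ $hms $hund? }xms;

my $old = qr{ \A $date \s+ $time $hund? \z }xms;   # original version
my $new = qr{ \A $date \s+ $time \z }xms;          # corrected version

my %verdict;
for my $s ('Mar 11 08:02:08', 'Mar 11 08:02:08.32', 'Mar 11 08:02:08.32.45') {
    my $o = $s =~ $old ? 'ok' : 'no';
    my $n = $s =~ $new ? 'ok' : 'no';
    $verdict{$s} = "old=$o new=$n";
    printf "%-25s %s\n", "'$s'", $verdict{$s};
}
```

Only the last string, with its two fractional parts, should be treated differently by the two patterns.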