This is getting more heavily into date parsing, and there are probably modules better suited to it (e.g., Date::Parse?), but here's an example of a structured approach using a nifty regex feature from 5.10+, named capture groups read back through %+ (caution: this code is obviously not exhaustively tested):

>perl -wMstrict -le
"my @dates = (
   'Mar 11 08:02:08',          '11 Mar 08:02:08',
   'Mar 11 08:02:08.32',       '11 Mar 08:02:08.32',
   'Mar 11 2011 08:02:08',     '2011 Nov 11 08:02:08',
   'Mar 11 2011 08:02:08.32',  '11 Dec 2011 08:02:08',
   '--------------------',
   'Mar 32 2011 08:02:08',  '2011 08:02:08',  'Mar 11 2011 .00',
   );
 ;;
 my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
 my $mname = qr{ (?<MON> @{[ join '|', @months ]})           }xms;
 my $mday  = qr{ (?<DAY> 0 [1-9] | [12] \d | 3 [01])         }xms;
 my $day   = qr{ $mname \s+ $mday | $mday \s+ $mname         }xms;
 my $year  = qr{ (?<YEAR> (?: 19 | 20) \d\d)                 }xms;
 my $date  = qr{ $day (?: \s+ $year)? | (?: $year \s+)? $day }xms;
 my $hms   = qr{ (?<HMS> \d\d (?: : \d\d){2})                }xms;
 my $hund  = qr{ (?<HUND> \. \d{2})                          }xms;
 my $time  = qr{ $hms $hund?                                 }xms;
 ;;
 DATE:
 for my $d (@dates) {
   my $parsed = $d =~ m{ \A $date \s+ $time \z }xms;
   if (not $parsed) {
     warn qq{bad date: '$d'};
     next DATE;
     }
   my $day = $+{DAY};
   my $mon = $+{MON};
   my $yr  = $+{YEAR} || '1999';
   my $t   = $+{HMS};
   my $h   = $+{HUND} || '.00';
   my $canonical_date = qq{$mon $day $yr $t$h};
   printf qq{%-25s -> '%s' \n}, qq{'$d'}, $canonical_date;
   }
"
'Mar 11 08:02:08'         -> 'Mar 11 1999 08:02:08.00'
'11 Mar 08:02:08'         -> 'Mar 11 1999 08:02:08.00'
'Mar 11 08:02:08.32'      -> 'Mar 11 1999 08:02:08.32'
'11 Mar 08:02:08.32'      -> 'Mar 11 1999 08:02:08.32'
'Mar 11 2011 08:02:08'    -> 'Mar 11 2011 08:02:08.00'
'2011 Nov 11 08:02:08'    -> 'Nov 11 2011 08:02:08.00'
'Mar 11 2011 08:02:08.32' -> 'Mar 11 2011 08:02:08.32'
'11 Dec 2011 08:02:08'    -> 'Dec 11 2011 08:02:08.00'
bad date: '--------------------' at -e line 1.
bad date: 'Mar 32 2011 08:02:08' at -e line 1.
bad date: '2011 08:02:08' at -e line 1.
bad date: 'Mar 11 2011 .00' at -e line 1.
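For anyone who hasn't met named captures yet, here is a minimal standalone sketch of just that feature, cut down from the patterns above. The helper name parse_day and the three-month list are made up for illustration; only the (?<NAME> ...) syntax and the %+ hash come from Perl itself (5.10 or later).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original one-liner): match a
# 'Mon DD' or 'DD Mon' string and return its named captures.
sub parse_day {
    my ($str) = @_;

    # Deliberately tiny month list, just enough for the demo.
    my $mon = qr{ (?<MON> Jan | Feb | Mar ) }xms;
    my $day = qr{ (?<DAY> \d\d )            }xms;

    # Both alternatives reuse the names MON and DAY; %+ reports the
    # leftmost group of each name that actually captured something.
    return unless $str =~ m{ \A (?: $mon \s+ $day | $day \s+ $mon ) \z }xms;

    # Copy %+ right away: it reflects only the last successful match.
    my %captured = %+;
    return \%captured;
}

for my $s ('Mar 11', '11 Mar', 'Mar') {
    my $got = parse_day($s);
    my $msg = $got ? "MON=$got->{MON} DAY=$got->{DAY}" : 'no match';
    print "'$s' -> $msg\n";
}
```

Whichever order the pieces arrive in, the caller asks for them by name rather than by counting parentheses, which is what lets the larger regex above be assembled from small qr// parts.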

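Update: The central parsing regex above was originally
m{ \A $date \s+ $time $hund? \z }xms
but should have been and is now
m{ \A $date \s+ $time \z }xms
The extra $hund? was redundant because $time already ends with its own optional $hund (at worst, the doubled version would have let a second fractional part such as '.32.45' slip through). This fix produces no change in the output shown above.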
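As a quick check on that update, the sketch below runs both versions of the time portion against a few strings. The $hms, $hund and $time patterns are copied from the code above; the helper name accepts and the sample strings are only illustrative.

```perl
use strict;
use warnings;

my $hms  = qr{ (?<HMS>  \d\d (?: : \d\d){2} ) }xms;
my $hund = qr{ (?<HUND> \. \d{2} )            }xms;
my $time = qr{ $hms $hund?                    }xms;

# Hypothetical helper: returns (old, new) as 1/0 flags, where "old" is
# the pattern with the extra $hund? and "new" is the corrected one.
sub accepts {
    my ($str) = @_;
    my $old = $str =~ m{ \A $time $hund? \z }xms ? 1 : 0;
    my $new = $str =~ m{ \A $time \z }xms        ? 1 : 0;
    return ($old, $new);
}

for my $s ('08:02:08', '08:02:08.32', '08:02:08.32.45') {
    my ($old, $new) = accepts($s);
    printf "%-16s old=%d new=%d\n", "'$s'", $old, $new;
}
```

The two versions agree on ordinary times with or without hundredths and differ only on the doubled fraction, which none of the test dates above contain.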

In reply to Re: regex for multiple dates by AnomalousMonk
in thread regex for multiple dates by Anonymous Monk
