This gets more heavily into date parsing, and there are probably modules better suited to it (e.g., Date::Parse?), but here's an example of a structured approach using named capture groups, a nifty regex feature available from Perl 5.10 on (caution: this code is obviously not exhaustively tested):

    >perl -wMstrict -le
    "my @dates = (
       'Mar 11 08:02:08',
       '11 Mar 08:02:08',
       'Mar 11 08:02:08.32',
       '11 Mar 08:02:08.32',
       'Mar 11 2011 08:02:08',
       '2011 Nov 11 08:02:08',
       'Mar 11 2011 08:02:08.32',
       '11 Dec 2011 08:02:08',
       '--------------------',
       'Mar 32 2011 08:02:08',
       '2011 08:02:08',
       'Mar 11 2011 .00',
       );
     ;;
     my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
     my $mname = qr{ (?<MON> @{[ join '|', @months ]}) }xms;
     my $mday = qr{ (?<DAY> 0 [1-9] | [12] \d | 3 [01]) }xms;
     my $day = qr{ $mname \s+ $mday | $mday \s+ $mname }xms;
     my $year = qr{ (?<YEAR> (?: 19 | 20) \d\d) }xms;
     my $date = qr{ $day (?: \s+ $year)? | (?: $year \s+)? $day }xms;
     my $hms = qr{ (?<HMS> \d\d (?: : \d\d){2}) }xms;
     my $hund = qr{ (?<HUND> \. \d{2}) }xms;
     my $time = qr{ $hms $hund? }xms;
     ;;
     DATE:
     for my $d (@dates) {
       my $parsed = $d =~ m{ \A $date \s+ $time \z }xms;
       if (not $parsed) {
         warn qq{bad date: '$d'};
         next DATE;
         }
       my $day = $+{DAY};
       my $mon = $+{MON};
       my $yr = $+{YEAR} || '1999';
       my $t = $+{HMS};
       my $h = $+{HUND} || '.00';
       my $canonical_date = qq{$mon $day $yr $t$h};
       printf qq{%-25s -> '%s' \n}, qq{'$d'}, $canonical_date;
       }
    "
    'Mar 11 08:02:08'         -> 'Mar 11 1999 08:02:08.00'
    '11 Mar 08:02:08'         -> 'Mar 11 1999 08:02:08.00'
    'Mar 11 08:02:08.32'      -> 'Mar 11 1999 08:02:08.32'
    '11 Mar 08:02:08.32'      -> 'Mar 11 1999 08:02:08.32'
    'Mar 11 2011 08:02:08'    -> 'Mar 11 2011 08:02:08.00'
    '2011 Nov 11 08:02:08'    -> 'Nov 11 2011 08:02:08.00'
    'Mar 11 2011 08:02:08.32' -> 'Mar 11 2011 08:02:08.32'
    '11 Dec 2011 08:02:08'    -> 'Dec 11 2011 08:02:08.00'
    bad date: '--------------------' at -e line 1.
    bad date: 'Mar 32 2011 08:02:08' at -e line 1.
    bad date: '2011 08:02:08' at -e line 1.
    bad date: 'Mar 11 2011 .00' at -e line 1.
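
For anyone who hasn't met named captures yet, here is a minimal sketch of the mechanism on its own. It assumes a cut-down month list, and the helper name md_fields is made up for illustration; neither is part of the code above. The point is that the same group names can appear in more than one alternative, and %+ hands back whichever one actually matched:

```perl
use strict;
use warnings;

# Cut-down, hypothetical versions of the month and day patterns.
# The same group names (MON, DAY) appear in both alternatives of $md.
my $mon = qr{ (?<MON> Jan | Feb | Mar) }xms;
my $dd  = qr{ (?<DAY> \d\d) }xms;
my $md  = qr{ $mon \s+ $dd | $dd \s+ $mon }xms;

# md_fields() is an illustrative helper, not part of the original post.
# Returns 'MON DAY' in a fixed order, or nothing if the string doesn't match.
sub md_fields {
    my ($s) = @_;
    return unless $s =~ m{ \A $md \z }xms;
    # %+ holds the leftmost *defined* group of each name, so whichever
    # alternative matched, the same two keys are filled in.
    return join ' ', $+{MON}, $+{DAY};
}

for my $s ('Mar 11', '11 Mar', 'Mar') {
    my $r = md_fields($s);
    printf "%-8s -> %s\n", qq{'$s'}, defined $r ? qq{'$r'} : 'no match';
}
```

Both 'Mar 11' and '11 Mar' come back as 'Mar 11', which is what lets the full example build one canonical string without caring which field order was matched.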

Update: The central parsing regex above was originally
    m{ \A $date \s+ $time $hund? \z }xms
but should have been, and now is,
    m{ \A $date \s+ $time \z }xms
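The extra $hund? was redundant because $time already ends with an optional $hund, and the fix produces no change in the output above. (Strictly speaking, the old form would also have let a second fractional part such as .32.45 slip through; none of the test dates exercise that.)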
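
A quick sketch comparing the two forms, reduced to just the time portion. The names $old and $new and the sample strings are mine, chosen for illustration; the three sub-patterns are copied from the code above:

```perl
use strict;
use warnings;

my $hms  = qr{ (?<HMS> \d\d (?: : \d\d){2}) }xms;
my $hund = qr{ (?<HUND> \. \d{2}) }xms;
my $time = qr{ $hms $hund? }xms;

# Time part of the regex as first posted ($old) and as corrected ($new).
my $old = qr{ \A $time $hund? \z }xms;
my $new = qr{ \A $time \z }xms;

# Report, for each sample, whether each form accepts it.
for my $t ('08:02:08', '08:02:08.32', '08:02', '08:02:08.32.45') {
    printf "%-16s old=%-5s new=%s\n", qq{'$t'},
        ($t =~ $old ? 'match' : 'no'),
        ($t =~ $new ? 'match' : 'no');
}
```

The two forms agree on the first three samples and differ only on the doubled fraction, which the old form accepts and the corrected one rejects.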


In reply to Re: regex for multiple dates by AnomalousMonk
in thread regex for multiple dates by Anonymous Monk
