Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

(This is a little bit related to the $1 vs $+ post, but since I am asking a different thing, I split it into two posts.)

I have this complex regex to do the date matching:

$s =~ s/\ ((0?[1-9]|1[0-2])\/(0?[1-9]|[1-2][0-9]|3[0-1])\/(19|20)?[0-9][0-9](\s(((0?[0-9]|1[0-9]|2[0-3]):[0-5][0-9](:[0-5][0-9])?)|((0?[0-9]|1[0-2]):[0-5][0-9](:[0-5][0-9])?\s(AM|PM))))?)\ /\n$1\n/g;

It works fine with slashes, but I want to extend it to hyphens and even commas. The solution I see is to change the \/ into a character class or an alternation. I can see that working for 80% of cases, but I also expect the incoming file to contain messy dates, so things like this may appear:

03/20, 2008

How can I make sure the second delimiter is the same as the first one?

Thank you.

Re: The first match affect the following of the regex
by grep (Monsignor) on Mar 30, 2008 at 18:51 UTC
    I have two comments:
    1. Yes, you most likely want character classes, but use them in some splits. This screams split: you are taking a string and splitting it into different pieces.
    2. Why aren't you using a module for this? Date::Calc and DateTimeX::Easy are both excellent choices. Using one of them you're guaranteed that either:
      • The module can parse the string - you will get either a date object or an array of the date tokens.
      • If it can't parse it - you get an error.
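A rough sketch of the split idea from point 1, using only core Perl. The helper name split_date and the range checks are assumptions made for illustration; they come from neither the original post nor any module. Note that it deliberately accepts mixed delimiters such as "03/20, 2008", and it expects the date string to be already isolated and trimmed.

```perl
use strict;
use warnings;

# Hypothetical helper: split a date-like string on "/", "-" or ","
# (with optional whitespace around the delimiter) and return
# (month, day, year), or an empty list if the pieces do not fit.
sub split_date {
    my ($text) = @_;
    my @parts = split m{\s*[/,-]\s*}, $text;

    return unless @parts == 3;                   # need exactly three fields
    return if grep { !/\A\d{1,4}\z/ } @parts;    # every field must be digits

    my ($month, $day, $year) = @parts;
    return unless $month >= 1 && $month <= 12;
    return unless $day   >= 1 && $day   <= 31;

    return ($month, $day, $year);
}

for my $text ('03/20/2008', '03/20, 2008', '13-01-2008') {
    my @date = split_date($text);
    print @date ? "parsed:   $text\n" : "rejected: $text\n";
}
```

This only checks the shape of the fields. It does not know that February has no 30th, which is one reason a date module is still the safer choice.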
    grep
    One dead unjugged rabbit fish later...
Re: The first match affect the following of the regex
by igelkott (Priest) on Mar 30, 2008 at 18:51 UTC

    If this were a simpler query, using a backreference (e.g., \1) in your search would be a way to do this. See posts regarding Match Double Letters for examples.
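A minimal sketch of the backreference idea on a simplified, date-only pattern (no time part). The first delimiter is captured in its own group, and the backreference then requires the second delimiter to be the same character. The names $date_re and find_date are made up for this example. In the full regex from the question the delimiter would land in a different group number, so the backreference would have to be renumbered to match.

```perl
use strict;
use warnings;

# Simplified date pattern: group 2 captures the first delimiter,
# and \2 forces the second delimiter to be identical to it.
my $date_re = qr{
    \b
    ( 0?[1-9] | 1[0-2] )               # group 1: month
    ( [/,-] )                          # group 2: first delimiter
    ( 0?[1-9] | [12][0-9] | 3[01] )    # group 3: day
    \2                                 # same delimiter as group 2
    ( (?:19|20)? [0-9]{2} )            # group 4: year, 2 or 4 digits
    \b
}x;

# Hypothetical helper: returns (month, day, year, delimiter),
# or an empty list when no consistently delimited date is found.
sub find_date {
    my ($text) = @_;
    return $text =~ $date_re ? ($1, $3, $4, $2) : ();
}

for my $text ('03/20/2008', '03-20-08', '03,20,2008', '03/20, 2008') {
    my @date = find_date($text);
    print @date ? "matched:  $text\n" : "rejected: $text\n";
}
```

With this pattern "03/20, 2008" is rejected, because the second delimiter is a comma while the first is a slash.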

    But, like my reply to your previous post, I'd still suggest using a Date module, unless this is just for your education.

Re: The first match affect the following of the regex
by poolpi (Hermit) on Mar 31, 2008 at 07:39 UTC