
I'll comment on the regular expression:

    #          1      2        3                                        4     5     6
    $line =~ /^(\w+)::(gmdate):(20\d\d-[01]\d-[0123]\d [012]\d:[0-6]\d):(\w+):(\w*):(.*)/

The stuff in parentheses fills in $1 to $6. These are commonly called "capturing parentheses", and they are documented in perlre.
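To see the capture variables in action, here is a minimal sketch. The sample line is made up to fit the pattern, since your real data may look different, and @fields is just a name I picked for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample line, invented to fit the pattern; real log lines may differ.
my $line = 'info::gmdate:2013-06-10 12:34:status:user:some free text';

my @fields;
if ($line =~ /^(\w+)::(gmdate):(20\d\d-[01]\d-[0123]\d [012]\d:[0-6]\d):(\w+):(\w*):(.*)/) {
    # Copy the capture variables right away; the next successful match overwrites them.
    @fields = ($1, $2, $3, $4, $5, $6);
    print "\$$_ = $fields[$_ - 1]\n" for 1 .. 6;
}
```

Always check that the match succeeded before using $1 and friends, because after a failed match they still hold whatever the last successful match left in them.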

The first pair of parentheses captures a sequence of word characters (\w+), like info.

The second pair captures the literal string gmdate. We could have left the capture out, but maybe we want to expand the RE later to allow for other strings in that place.

The third pair captures something that looks like a YYYY-mm-dd HH:MM timestamp, with some basic validation thrown in:

  1. 20\d\d matches four digits that start with 20. For timestamps, this is sensible as it is unlikely that you will have to process timestamps from 1999, or timestamps in 2100.
  2. -[01]\d matches a minus followed by the digit 0 or 1, followed by another digit. This will capture something that vaguely looks like a month number, allowing numbers from 00 to 19. This is not exactly a month, but close enough. In particular, a YYYY-dd-mm timestamp will fail to match whenever its day is 20 or higher.
  3. -[0123]\d matches a minus followed by the digit 0, 1, 2 or 3, followed by another digit. This will match the day part of the date, allowing numbers from 00 to 39. It does not validate the day against the month, so the 30th February or 31st April will still match.
  4. [012]\d:[0-6]\d will match something that vaguely looks like HH:MM, with the hour between 00 and 29 and the minute between 00 and 69. I allowed for minute 60 because I mistook the field for seconds: depending on the exact specifics of UTC, you can have timestamps with 60 or 61 seconds, I believe. In any case, it's better to be lenient here.
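The leniency described in the list can be checked with a small sketch. It uses only the timestamp part of the pattern, anchored at both ends, and the candidate strings are hypothetical values I chose to probe the edges of each character class:

```perl
use strict;
use warnings;

# Only the timestamp part of the pattern, anchored at both ends.
my $ts = qr/^20\d\d-[01]\d-[0123]\d [012]\d:[0-6]\d$/;

# Hypothetical test values, not taken from any real data.
my @candidates = (
    '2013-06-10 12:34',   # a plausible timestamp
    '2013-02-30 12:34',   # 30th February
    '2013-19-39 29:69',   # upper edge of every character class
    '1999-06-10 12:34',   # year does not start with 20
    '2013-25-06 12:34',   # YYYY-dd-mm with day 25
);

my %accepted;
for my $candidate (@candidates) {
    $accepted{$candidate} = $candidate =~ $ts ? 1 : 0;
    printf "%-18s %s\n", $candidate, $accepted{$candidate} ? 'matches' : 'no match';
}
```

The first three candidates match and the last two do not. If you need real date validation, do it after the match with a date module instead of making the regular expression stricter.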

If you can tell us where exactly you have problems with the regular expression, that will help us help you better.

Re^8: How to split unique patterns
by cornelius80 (Initiate) on Jun 11, 2013 at 02:29 UTC
    Hi Corion,

    Thank you so much for your patience on this. I really appreciate your explanation, as it was very clear and precise. Kudos to you. Could I trouble you with just one more question, please? What does the following mean?

        @info{ @columns } = ($1,$2,$3,$4,$5,$6);

    Kind regards,
    Cornelius

      The construct @info{ @columns } is a Hash Slice, as described in perldata.

      It basically allows mass-assignment to multiple keys of a hash in one statement.
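As a minimal sketch of what the hash slice does: the column names below are hypothetical, because the real contents of @columns are not shown here, and @values stands in for ($1,$2,$3,$4,$5,$6) after a successful match.

```perl
use strict;
use warnings;

# Hypothetical column names; the real @columns is not shown here.
my @columns = qw(level type timestamp status user message);
# Stand-ins for ($1, $2, $3, $4, $5, $6) after a successful match.
my @values  = ('info', 'gmdate', '2013-06-10 12:34', 'status', 'user', 'some free text');

my %info;
@info{ @columns } = @values;   # hash slice: assigns all six keys in one statement

# The slice above has the same effect as this loop:
my %by_loop;
$by_loop{ $columns[$_] } = $values[$_] for 0 .. $#columns;

print "$_ => $info{$_}\n" for @columns;
```

Note the sigil: the slice is written @info{...} because it yields a list of values, even though %info is a hash.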