Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

But obviously I don't. Can anyone tell me why $year is empty? Thanks
$date="201004061749"
DB<153> p $date
201004061749
DB<154> ($year,$mon,$day,$hour,$min)=split(/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})/,$date)
DB<155> p $year
DB<156> p $mon
2010
DB<157> p $day
04
DB<158> p $hour
06
DB<159> p $min
17

Re: I thought I understood split...
by kennethk (Abbot) on Apr 06, 2010 at 17:07 UTC
    I think you are getting confused between split and capturing from regular expressions. For example, the code:

    ($year,$mon,$day,$hour,$min)= $date =~ /(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})/;

    does what I think you intend. Your issue with split comes from:

    If the PATTERN contains parentheses, additional list elements are created from each matching substring in the delimiter.

    Your split is actually returning ("", "2010", "04", "06", "17", "49") because your entire string matches the pattern and is therefore treated as the delimiter, similar to @array = split /x/, "x";
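    To see the two behaviours side by side, here is a minimal sketch using only core Perl. The names $re, @from_split and @from_match are just mine for illustration:

```perl
use strict;
use warnings;

my $date = "201004061749";
my $re   = qr/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})/;

# split treats the whole match as the delimiter. The field in front of
# it is the empty string, and the five captures are inserted after it.
my @from_split = split $re, $date;

# A match in list context returns only the captures.
my @from_match = $date =~ $re;

print join("|", map { "[$_]" } @from_split), "\n";
print join("|", map { "[$_]" } @from_match), "\n";
```

    The first line printed is []|[2010]|[04]|[06]|[17]|[49] and the second is [2010]|[04]|[06]|[17]|[49], which is why every variable in the original assignment ends up shifted by one.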

Re: I thought I understood split...
by ikegami (Patriarch) on Apr 06, 2010 at 17:54 UTC

    The purpose of split is to parse a separated list into its elements. There's no separator between the fields on which to split, making split the wrong choice. Use the match operator (m//) instead.

    my ($year,$mon,$day,$hour,$min) = $date =~ /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})\z/;

    If you're not trying to validate, you can use unpack a bit more cleanly here:

    my ($year,$mon,$day,$hour,$min) = unpack('A4 A2 A2 A2 A2', $date);
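    A rough sketch of what "not trying to validate" means in practice. The malformed input $bad is made up for illustration:

```perl
use strict;
use warnings;

my $bad = "20100406174x";   # hypothetical input with a stray non-digit

# The anchored match fails as a whole, so the list is empty.
my @matched = $bad =~ /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})\z/;

# unpack only slices by width; it does not look at the characters.
my @unpacked = unpack('A4 A2 A2 A2 A2', $bad);

print scalar(@matched), "\n";
print join("|", @unpacked), "\n";
```

    This prints 0 and then 2010|04|06|17|4x, so with unpack any checking of the fields has to be done separately.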
Re: I thought I understood split...
by ambrus (Abbot) on Apr 06, 2010 at 18:07 UTC

    In addition to what others say, you could use Date::Manip to parse the date from any format (including this one) and extract the parts:

    use Date::Manip;
    $date = "201004061749";
    ($year, $mon, $day, $hour, $min) = UnixDate($date, qw"%Y %m %d %H %M");
Re: I thought I understood split...
by roboticus (Chancellor) on Apr 06, 2010 at 17:16 UTC

    It's rather simple: split uses the pattern you supply as a separator. So it splits the string into two items, "" and "", separated by "201004061749". Then (as described in the split documentation), since you have parentheses in your pattern, split inserts the captured groups into the result list. So it's returning ("", "2010", "04", "06", "17", "49", "").

    I'm guessing you really wanted to use the match operator and assign the list of captured values to your list of variables.

    UPDATE: I just tried it, and I was mistaken. There's no "" at the end of the list that split returns, because split removes trailing empty fields when no LIMIT is given:

    [13:19:56] /DOS/c/Program Files/Bank of America/EFS $ perl -e '$_="201004061749";print "<",join(".",split/(\d\d\d\d)(\d\d)(\d\d)(\d\d)(\d\d)/),">"'
    <.2010.04.06.17.49>
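    For completeness, a small sketch of where the trailing "" went. It assumes the documented behaviour that a negative LIMIT makes split keep trailing empty fields; the variable names are only illustrative:

```perl
use strict;
use warnings;

my $date = "201004061749";
my $re   = qr/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})/;

# No LIMIT: the empty field after the delimiter is dropped.
my @default = split $re, $date;

# Negative LIMIT: trailing empty fields are kept.
my @keep = split $re, $date, -1;

print scalar(@default), "\n";
print scalar(@keep), "\n";
```

    This prints 6 and then 7: the leading "" is always kept because the match at the start of the string has positive width, while the trailing "" only survives with the negative LIMIT.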