Quicksilver has asked for the wisdom of the Perl Monks concerning the following question:

I crave some benevolence from the monks. I'm trying to teach myself some basic text parsing. I want to test whether part of a month name exists in a string: I split the string into tokens and then take the first 3 characters of each token. If that substring is a month, I try to swap it for a number using a hash. I keep getting a "use of an uninitialised value" warning in the concatenation.
    sub num_month {
        my $date = shift;
        my %months = (
            JAN => '01', FEB => '02', MAR => '03', APR => '04',
            MAY => '05', JUN => '06', JUL => '07', AUG => '08',
            SEP => '09', OCT => '10', NOV => '11', DEC => '12',
        );
        my @tokens = split (" ", $date);
        my $date;
        foreach my $num (@tokens) {
            my $strBegin = substr($num, 0, 3);
            my $date =~ s/$strBegin/$months{uc $strBegin}/;
        }
        return $date;
    }
As this pattern may or may not exist in the line, I just wanted to do a light swap using a hash rather than calling Date::Parse, since there is no standard pattern.

Re: A quick date swap from a string
by toolic (Bishop) on Jul 08, 2009 at 20:44 UTC
    Do you use warnings;? When I add that to your code, it shows that you use my $date twice in the same context.

    Update: Showing how you call your function would be helpful, too.
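
    A reduced sketch of the warning mentioned above, assuming a cut-down copy of the sub (the name demo_num_month is made up for illustration). The code is compiled inside a string eval only so that the compile-time warning can be captured and printed:

```perl
use strict;
use warnings;

my @warnings;
{
    # Collect warnings instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    # Hypothetical reduced version of the original sub: two "my $date"
    # declarations in the same scope.
    eval q{
        sub demo_num_month {
            my $date = shift;
            my $date;          # second declaration masks the first
            return $date;
        }
        1;
    } or die $@;
}

print @warnings;
```

    With warnings enabled, perl reports that the "my" variable $date masks an earlier declaration in the same scope, which points straight at the second declaration.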

Re: A quick date swap from a string
by zwon (Abbot) on Jul 08, 2009 at 20:46 UTC
    Perhaps this is what you want:
        foreach my $num (@tokens) {
            my $strBegin = uc substr($num, 0, 3);
            $date = $months{$strBegin} if exists $months{$strBegin};
        }

    I don't know what data you're processing, but this method doesn't look reliable.

Re: A quick date swap from a string
by jrsimmon (Hermit) on Jul 08, 2009 at 21:05 UTC
    Your use of $date is unlikely to be what you intended. Assuming the second my $date; isn't supposed to be there, we're left with:

        sub num_month {
            my $date = shift;
            ...
            foreach ... {
                my $date = ...   # <---- this is a new $date scalar, which does not
                                 #       exist outside of this foreach loop
            }
            return $date;        # <---- this is the $date from before the foreach loop,
                                 #       i.e. exactly what was passed to the sub
        }

    You can see the behavior by running this short script:
        use strict;
        use warnings;

        my $test = &GO(1);
        print $test;

        sub GO {
            my $val = shift;
            foreach (0..10) {
                my $val += $_;
                print $val . "\n";
            }
            return $val;
        }

    Output:

        0
        1
        2
        3
        4
        5
        6
        7
        8
        9
        10
        1

Re: A quick date swap from a string
by Marshall (Canon) on Jul 09, 2009 at 05:03 UTC
    This question appears a bit strange, as normally one would know the position of the "month" in the string. If that is the case, then no "foreach" is required. But going with the problem statement, I would add keys to the month hash for the full month names rather than taking a substr of the first 3 letters. Also, splitting on whitespace in general works out better than splitting on a single literal space, because there can be tabs or multiple spaces in the input that are hard to see. Both split " " (the special-case pattern that split uses by default) and split /\s+/ do this; the only difference is that /\s+/ returns an empty leading field when the line starts with whitespace. As another point, if this line comes directly from input, splitting on whitespace eliminates the need for "chomp;" (\n is a whitespace character).
        sub num_month {
            my $date_line = shift;    # like jan 12 2009 or january 12 2009
            my %months = (
                JAN => '01', FEB => '02', MAR => '03', APR => '04',
                MAY => '05', JUN => '06', JUL => '07', AUG => '08',
                SEP => '09', OCT => '10', NOV => '11', DEC => '12',
                JANUARY => '01',      # ....etc.....
            );
            my (@tokens) = split (/\s+/, uc($date_line));
            foreach my $token (@tokens) {
                return $months{$token} if $months{$token};
            }
            die "no month in @tokens";    ## optional but you may want this
        }
    Note: if you omit the \n in the die "text", Perl will append the script name and line number of the death to the output.
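
    A small sketch of that die behaviour, assuming a made-up message ("no month found"). The eval blocks are only there to trap the exception so both messages can be printed side by side:

```perl
use strict;
use warnings;

# eval {} traps the exception; $@ then holds the message as die produced it.
eval { die "no month found\n" };
my $with_newline = $@;

eval { die "no month found" };
my $without_newline = $@;

print "with newline:    $with_newline";
print "without newline: $without_newline";
```

    The first message comes back unchanged, while the second has " at <script> line <N>." appended, which is usually what you want while debugging.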

    Update:

    showing how to split on \s and , and how to get, say, the 2nd thing in a split via a list slice without using foreach():

        #!/usr/bin/perl -w
        use strict;

        my $date_line = "Jan 2 , 2009 \n";
        my (@tokens) = split (/[\s,]+/, uc($date_line));
        print "@tokens\n";    # prints JAN 2 2009

        my $date_line2 = "2 jan , 2009 \n";
        # to get 2nd thing in split, use a list slice...
        my $month_text = (split (/[\s,]+/, uc($date_line2)))[1];
        print "$month_text\n";    # prints JAN

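
    For comparison, a short sketch of how split " " and split /\s+/ treat the same line. The input string is a hypothetical example with leading spaces, a tab and a trailing newline:

```perl
use strict;
use warnings;

my $line = "  Jan\t12   2009\n";    # hypothetical input

my @special = split ' ', $line;     # special case: any whitespace, leading whitespace skipped
my @regex   = split /\s+/, $line;   # ordinary regex: leading whitespace yields an empty field

print scalar(@special), " fields: ", join('|', @special), "\n";   # 3 fields: Jan|12|2009
print scalar(@regex),   " fields: ", join('|', @regex),   "\n";   # 4 fields: |Jan|12|2009
```

    Both forms cope with tabs, runs of spaces and the trailing newline; they differ only in the empty leading field.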
Re: A quick date swap from a string
by mattford63 (Sexton) on Jul 08, 2009 at 22:01 UTC
    Your variable scope looks wrong to me. You declare $date at the top of the routine, then try to declare it again a little later (in the same scope). Finally, you declare it a third time inside the foreach loop.

    Drop the second "my $date;", and remove the "my" keyword on the substitution line in the foreach loop.

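      If you prefer, the substitution can also be written with the /e modifier:
           $date =~ s/$strBegin/$months{uc $strBegin}/e;

      It is not required, though: a hash subscript in the replacement is evaluated as Perl code even without /e, so uc() is executed either way.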
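
    Putting those changes together, a minimal sketch of the sub with a single $date. The exists check and the \Q...\E quoting are my additions and not part of the original post; the check keeps tokens that are not months from being replaced with an undefined value:

```perl
use strict;
use warnings;

sub num_month {
    my $date = shift;
    my %months = (
        JAN => '01', FEB => '02', MAR => '03', APR => '04',
        MAY => '05', JUN => '06', JUL => '07', AUG => '08',
        SEP => '09', OCT => '10', NOV => '11', DEC => '12',
    );

    foreach my $token (split ' ', $date) {
        my $strBegin = substr($token, 0, 3);

        # Added guard: skip tokens whose first three letters are not a month.
        next unless exists $months{uc $strBegin};

        # One $date throughout; uc() runs inside the subscript without /e.
        $date =~ s/\Q$strBegin\E/$months{uc $strBegin}/;
    }
    return $date;
}

print num_month('Jan 12 2009'), "\n";       # 01 12 2009
print num_month('21st May, 2009'), "\n";    # 21st 05, 2009
```

    Like the original, this only swaps the three-letter prefix, so "January" would become "01uary", and the substitution hits the first occurrence of the prefix anywhere in the string.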
Re: A quick date swap from a string
by ashish.kvarma (Monk) on Jul 09, 2009 at 09:00 UTC
    Not sure if the following code helps you but this is how I would have done it if I was not sure about the location of month in the string.
        my $mons = {
            JAN => '01', FEB => '02', MAR => '03', APR => '04',
            MAY => '05', JUN => '06', JUL => '07', AUG => '08',
            SEP => '09', OCT => '10', NOV => '11', DEC => '12',
            JANUARY   => '01', FEBRUARY => '02', MARCH    => '03',
            APRIL     => '04', JUNE     => '06', JULY     => '07',
            AUGUST    => '08', SEPTEMBER => '09', OCTOBER => '10',
            NOVEMBER  => '11', DECEMBER => '12',
            # MAY is already listed above
        };

        my @dates = (
            '23 March 2009',
            '21st May, 2009',
            '2009, Jun 30',
            'Jul 09, 2009 at 05:03 UTC',
            'JAN 2 2009',
            'JAN 2 2009',
            '2 jan , 2009',
            '2009, 02-jan',
        );

        my @new_dates;
        printf "[%25s]\t[%25s]\n\n", "original date", "new date";
        foreach my $date (@dates) {
            printf "[%25s]\t", $date;
            # I have considered '\s,;:-' as separators; modify the regex
            # if you have more (e.g. add '/' for date format '08/Jan/2009').
            if (my @matches = grep {$date =~ /(?:[ ,;:-]|^)$_(?:[ ,;:-]|$)/i} keys %$mons) {
                # Line should match only one month, but just in case.
                foreach my $match (@matches) {
                    $date =~ s/$match/$mons->{$match}/gi;
                }
            }
            printf "[%25s]\n", $date;
            push @new_dates, $date;
        }

        # OUTPUT
        # [            original date]  [                 new date]
        #
        # [            23 March 2009]  [               23 03 2009]
        # [           21st May, 2009]  [            21st 05, 2009]
        # [             2009, Jun 30]  [              2009, 06 30]
        # [Jul 09, 2009 at 05:03 UTC]  [ 07 09, 2009 at 05:03 UTC]
        # [               JAN 2 2009]  [                01 2 2009]
        # [               JAN 2 2009]  [                01 2 2009]
        # [             2 jan , 2009]  [              2 01 , 2009]
        # [             2009, 02-jan]  [              2009, 02-01]

    Though, I would say that solution by 'poolpi' looks short and nice.
Re: A quick date swap from a string
by poolpi (Hermit) on Jul 09, 2009 at 08:15 UTC
        #!/usr/bin/perl
        use strict;
        use warnings;

        my @month_name = qw/
            january february march april may
            june july august september
        /;
        my $month = {};

        for ( 0 .. $#month_name ) {
            $month_name[$_] =~ s/(\w{3})(\w*)/$1(?:$2)?/;
            $month->{qr/$month_name[$_]/} = $_ + 1;
        }

        sub month_to_num {
            my $date = shift;
            for ( keys %$month ) {
                last if $date =~ s/$_/$month->{$_}/;
            }
            return $date;
        }

        printf "%02s %02d %4d\n", split /\s+/, month_to_num($_)
            for ( 'dec 7 2008', 'may 25 2003', 'march 01 1897' );

        # Output:
        # dec 07 2008
        # 05 25 2003
        # 03 01 1897


    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
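
    The month list above stops at september, which is why 'dec' comes back unchanged. As a variation on the same idea (not poolpi's code), here is a sketch that builds one case-insensitive alternation covering all twelve months and does the swap in a single substitution. The names %num, @alts and $month_re are made up for illustration:

```perl
use strict;
use warnings;

my @names = qw(january february march april may june
               july august september october november december);

my (%num, @alts);
for my $i (0 .. $#names) {
    my $abbr = substr $names[$i], 0, 3;
    my $rest = substr $names[$i], 3;
    $num{$abbr} = sprintf '%02d', $i + 1;
    # Three-letter abbreviation, optionally followed by the rest of the name.
    push @alts, length($rest) ? "$abbr(?:$rest)?" : $abbr;
}

my $alt      = join '|', @alts;
my $month_re = qr/\b($alt)\b/i;

sub month_to_num {
    my $date = shift;
    # Replace the first month name found with its two-digit number.
    $date =~ s/$month_re/$num{ lc substr($1, 0, 3) }/e;
    return $date;
}

print month_to_num($_), "\n" for 'dec 7 2008', 'may 25 2003', 'march 01 1897';
# 12 7 2008
# 05 25 2003
# 03 01 1897
```

    The word boundaries keep it from touching words that merely start with a month abbreviation, and a line without a month is returned unchanged.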
Re: A quick date swap from a string
by ggoebel (Sexton) on Jul 10, 2009 at 18:21 UTC
    You declare "my $date" twice:
      my $date= shift;
      ...
      my $date;
    
    The second one creates a new lexical with an undefined value, so the subsequent substitution and return are working with undef.