Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

The following value is assigned to a variable $stamp.
I'm trying to use regex to turn: Wed 08/22/2001 2:38p
into: 08_22_238p
So far,
$stamp =~ s/\//_/g;   # changes / to _
$stamp =~ s/\s+//g;   # strips out whitespace
turns it into: Wed08_22_2001_2:38p

I need to figure out how to strip out the first three characters: Wed (or Mon, Tue, Thu, Fri, Sat, Sun)
the "_2001" in the middle of the string (also needs to work when it becomes _2002, _2003, etc.)
and the ":" in the middle of the 2:38p.

Can anyone tell me how to do this?

Thanks in advance.

Replies are listed 'Best First'.
Re: How do I use regex to strip out specific characters in a string?
by blakem (Monsignor) on Aug 23, 2001 at 00:03 UTC
    Here is how I would do it:

    $stamp = 'Wed 08/22/2001 2:38p';
    my ($month,$day,$year,$time) = ($stamp =~ m|(\d+)/(\d+)/(\d+)\s+([\d:]+[pa])|);
    $time =~ s/://;
    my $newstamp = join('_',$month,$day,$year,$time);
    print "$stamp => $newstamp\n";

    Output:
    Wed 08/22/2001 2:38p => 08_22_2001_238p

    -Blake
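
The output above still carries the year, while the question asked for 08_22_238p. A minimal sketch of the same capture-and-join idea with the year matched but not captured follows. The helper name short_stamp and the choice to return nothing on a failed match are assumptions for illustration, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper; the name is an assumption for this sketch.
sub short_stamp {
    my ($stamp) = @_;

    # Capture month, day and time. The year is matched by the
    # third \d+ but deliberately left out of the capture groups.
    my ($month, $day, $time) =
        $stamp =~ m|(\d+)/(\d+)/\d+\s+([\d:]+[ap])|
        or return;    # no match: return nothing instead of a bogus string

    $time =~ tr/://d;    # drop the colon from the time
    return join '_', $month, $day, $time;
}

print short_stamp('Wed 08/22/2001 2:38p'), "\n";    # 08_22_238p
```

Checking the match result before using the captures avoids building a string out of undefined values when the input is not in the expected format.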

Re: How do I use regex to strip out specific characters in a string? (boo)
by boo_radley (Parson) on Aug 23, 2001 at 00:37 UTC
    array slicing !
    while (<DATA>){print join "_",(split/[ \/]/,$_)[1,2,4]}
    __DATA__
    Wed 08/22/2001 2:38p
    update Golf! Golf!
    the above is 40, inside brackets, with print

    update 22 : s/.{3,4} //g&&y|/|_|;
    update I wasn't removing the colon :( so, at 23 s/.{3,4} //g&&y|/:|_|d
    update as dutifully noted below, by danboo.

      Removed one.... s/....? //g&&tr|/:|_|d; Update
      And one more, s/....? //g&&tr|/:|_|d;
        20: s/\w+ //g&&y|/:|_|d;
      Your code gets shorter, but it's not getting any closer to solving the problem: the ':' remains.
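
For anyone who wants to check the golfed variants, here is a small sketch that runs the 20-character version against the sample string. The variable name $golfed is an assumption added for the example. The /d flag on y/// is what removes the colon: it deletes any listed character that has no counterpart in the replacement list.

```perl
use strict;
use warnings;

local $_ = 'Wed 08/22/2001 2:38p';

# s/\w+ //g removes every word that is followed by a space ("Wed " and "2001 ").
# y|/:|_|d then maps "/" to "_" and deletes ":" because of the d flag.
s/\w+ //g && y|/:|_|d;

my $golfed = $_;
print "$golfed\n";    # 08_22_238p
```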
Re: How do I use regex to strip out specific characters in a string?
by Cine (Friar) on Aug 23, 2001 at 00:01 UTC
    $stamp =~ s/\w+\s+(\d+)\/(\d+)\/\d+\s(\d+).(.*)/$1_$2_$3$4/;


    Update:

    Removed my extra \ ;) oops

    T I M T O W T D I
      $stamp =~ s/\w+\s+(\d+)\/(\d+)\/\d+\s(\d+).(.*)\/$1_$2_$3$4/;

      That won't even compile... you've been tripped up by your leaning toothpicks ('/\/\/\/\'). You can avoid that by changing your regex delimiter, i.e.

      s|\w+\s+(\d+)/(\d+)/\d+\s(\d+).(.*)|$1_$2_$3$4|

      -Blake
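
The delimiter point can be checked side by side. The sketch below runs the slash-delimited substitution and an equivalent written with brace delimiters and the /x flag, which is one possible alternative and not something the posts above used. The variable names are assumptions for the example.

```perl
use strict;
use warnings;

my $with_slashes = 'Wed 08/22/2001 2:38p';
my $with_braces  = $with_slashes;

# Slash delimiters: every literal / in the pattern has to be escaped.
$with_slashes =~ s/\w+\s+(\d+)\/(\d+)\/\d+\s(\d+).(.*)/$1_$2_$3$4/;

# Brace delimiters plus /x: no escaping needed, and room for comments.
$with_braces =~ s{
    \w+ \s+          # day name and the whitespace after it
    (\d+) / (\d+)    # month and day
    / \d+ \s         # year, matched but not kept
    (\d+) . (.*)     # hour, any one separator character, the rest
}{$1_$2_$3$4}x;

print "$with_slashes\n$with_braces\n";    # both lines: 08_22_238p
```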

Re: How do I use regex to strip out specific characters in a string?
by Cubes (Pilgrim) on Aug 23, 2001 at 00:15 UTC
    How about approaching from a slightly different angle: grab the bits of info you do want from the string, then use them to construct a new string.

    Do a regex match on your initial string, using parens and array context to get back the pieces you're interested in, like so:

    ($month, $day, $hour, $min, $ampm) = ($stamp =~ m|(\d+)/(\d+)/\d+\s+(\d+):(\d+)(\w)|);
    Then use those pieces to build your new string:
    $newstamp = sprintf("%02d_%02d_%02d%02d%s", $month, $day, $hour, $min, $ampm);
    (Using %02d in sprintf will give you leading zeros in front of single digit numbers, which will make your resulting string more consistent -- change them to just %d to leave off the leading zeros)

    This has the added benefit of making the code's function much more obvious 6 months from now than a cryptic series of substitutions, and being easier to update if and when your formats change.
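
A sketch of this extract-then-format approach wrapped in a subroutine follows. The name format_stamp and the empty return on a failed match are assumptions. The hour uses %d instead of %02d here so that the sample input yields the 08_22_238p asked for in the question; switch it back to %02d for a fixed-width result.

```perl
use strict;
use warnings;

# Hypothetical helper; name and failure behaviour are assumptions.
sub format_stamp {
    my ($stamp) = @_;

    my ($month, $day, $hour, $min, $ampm) =
        $stamp =~ m|(\d+)/(\d+)/\d+\s+(\d+):(\d+)(\w)|
        or return;    # input did not look like a stamp

    # %02d zero-pads month, day and minute; %d leaves the hour unpadded.
    return sprintf '%02d_%02d_%d%02d%s', $month, $day, $hour, $min, $ampm;
}

print format_stamp('Wed 08/22/2001 2:38p'), "\n";    # 08_22_238p
print format_stamp('Mon 1/5/2002 11:07a'),  "\n";    # 01_05_1107a
```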

Re: How do I use regex to strip out specific characters in a string?
by runrig (Abbot) on Aug 23, 2001 at 00:21 UTC
    Nobody has used tr yet:
    substr($str,0,4) = '';
    $str =~ tr/://d;
    $str =~ tr[/ ][__];

    Update: Hmm, thought I was wrong, but I was just mistaken :)
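
A step-by-step trace of the substr and tr approach, as a sketch. It uses the four-argument form of substr, which behaves the same as assigning an empty string to the lvalue form. As posted, the approach keeps the year, so the final substitution is an extra step that is not in runrig's post. The intermediate variable names are assumptions.

```perl
use strict;
use warnings;

my $str = 'Wed 08/22/2001 2:38p';

substr($str, 0, 4, '');      # four-argument substr: cut the leading "Wed "
my $after_substr = $str;     # 08/22/2001 2:38p

$str =~ tr/://d;             # delete the colon
$str =~ tr[/ ][__];          # slash and space both become underscores
my $with_year = $str;        # 08_22_2001_238p

# Extra step, not in the original post: drop the four-digit year.
$str =~ s/_\d{4}(?=_)//;

print "$after_substr\n$with_year\n$str\n";
```

The lookahead (?=_) keeps the underscore after the year in place, so the day and the time stay separated.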

Re: How do I use regex to strip out specific characters in a string?
by danboo (Beadle) on Aug 23, 2001 at 00:41 UTC
    i'd handle this with the following:
    ($stamp = join '_', (split '[/ ]', $stamp)[1,2,4]) =~ tr/://d;
    separate it into multiple lines for clarity if desired.

    cheers,

    - danboo
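
One way the one-liner above might be spread over several lines, as a sketch. The intermediate variable names are assumptions; the logic is the same split, slice, join and tr.

```perl
use strict;
use warnings;

my $stamp = 'Wed 08/22/2001 2:38p';

# Split on slashes and spaces: ('Wed', '08', '22', '2001', '2:38p')
my @fields = split m{[/ ]}, $stamp;

# Keep month, day and time; skip the day name (0) and the year (3).
my ($month, $day, $time) = @fields[1, 2, 4];

$time =~ tr/://d;                          # remove the colon
$stamp = join '_', $month, $day, $time;

print "$stamp\n";    # 08_22_238p
```
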
Re: How do I use regex to strip out specific characters in a string?
by tshabet (Beadle) on Aug 23, 2001 at 00:06 UTC
    How about
    $stamp =~ s/.{3}(.{6}).{5}/$1/gosix; #strips first day/year chars
    $stamp =~ s/://gosix; #strips the colon
    Of course, I suck at regexes, so this only makes sense in my head...

      RE notes: A few things to take note of since it seems that you may include those qualifiers on all matches which are not always needed or wanted.

      s/://gosix

      • The o says to compile the RE only once, which is the case anyway since the pattern is a literal.
      • The s makes . match \n as well, so the string is treated as one long line; in this match that is a no-op.
      • The i makes the match case insensitive, which slows down the match. Additionally, the : is not affected by case.
      • The x allows whitespace and comments inside the pattern, none of which have been included.
      • As noted before, tr/://d (or y/://d) is much more efficient for eliminating or changing single characters.
        Whooops...thanks dga, I was smoking a little too much of the old crack this afternoon, had gimsox on the mind :-) Thanks for the pointers/clarification.
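
A quick sketch showing that the plain s/://g and tr/://d give the same result on the time portion, with none of the extra modifiers. The variable names are assumptions. It also shows that tr returns the number of characters it matched, which can be handy as a check.

```perl
use strict;
use warnings;

my $via_s  = '2:38p';
my $via_tr = '2:38p';

$via_s =~ s/://g;                      # regex substitution, no modifiers beyond g

# tr/// returns how many characters it matched (here: how many it deleted).
my $deleted = ($via_tr =~ tr/://d);

print "$via_s $via_tr $deleted\n";     # 238p 238p 1
```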
Re: How do I use regex to strip out specific characters in a string?
by Cine (Friar) on Aug 23, 2001 at 00:08 UTC
    The above does it for you, but to answer your real questions: You can either use a regex or substr:
    $stamp =~ s/^.{3}(.*)/$1/;
    $stamp = substr($stamp,3); #<---- this is much faster
    The : is done like your / but just replace it with nothing.
    the _20xx_ problem is: $stamp =~ s/_\d{4}_/_/;
    You can handle the /, space and colon with a single command: tr/\/ :/_/d

    T I M T O W T D I
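
The individual pieces above can be chained on the original string, roughly as sketched below. The year step here works on the raw "/2001 " form instead of the "_2001_" form, which is an adaptation for this example and not what was posted.

```perl
use strict;
use warnings;

my $stamp = 'Wed 08/22/2001 2:38p';

$stamp = substr($stamp, 3);      # drop the three-letter day name
$stamp =~ s|/\d{4}\s+|_|;        # "/2001 " becomes a single underscore
$stamp =~ tr|/ :|_|d;            # "/" becomes "_"; space and ":" are deleted

print "$stamp\n";    # 08_22_238p
```

In the tr, only the first listed character has a counterpart in the replacement list, so with the d flag the remaining two (space and colon) are removed.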
Re: How do I use regex to strip out specific characters in a string?
by dragonchild (Archbishop) on Aug 23, 2001 at 00:08 UTC
    I wouldn't use regexes for this, in large part because they're harder for people to understand, even if well commented. Instead, I'd do something like:
    my ($day_of_week, $date, $time) = split ' ', $stamp;
    my ($month, $day_of_month, $year) = split '/', $date;
    $time =~ s/://g;
    my $munged_stamp = "${month}_${day_of_month}_${time}";

    ------
    /me wants to be the brightest bulb in the chandelier!

    Vote paco for President!

      Doing it this way causes the creation of 7 temporary variables.

      You also should use a tr when you are only trying to affect a single character:

      tr/://d; # instead of s/://g;
      runrig below gave you a very elegant solution using substr and tr
        I agree with the tr/// instead of the s///.

        But the creation of temp variables shouldn't be an issue. If you're that concerned about temporary variables, just undef them after you're done. You could even combine them into a single array, like so:

        {
            my @temp_variables;
            push @temp_variables, split ' ', $stamp;
            push @temp_variables, split '/', $temp_variables[1];
            $temp_variables[2] =~ tr/://d;
            $stamp = join '_', @temp_variables[3,4,2];
            undef @temp_variables;
        }
