PerlMonks  

Pattern Match/Trim Variables.

by LostS (Friar)
on Feb 17, 2003 at 19:01 UTC ( [id://236073]=perlquestion )

LostS has asked for the wisdom of the Perl Monks concerning the following question:

OK, here is what I am doing. I am doing a SELECT against a database. It returns a few fields that are all in the same format (12/03/2001 12:00:01.000). What I need to do is look at each field and, if it is in that format (MM/DD/YYYY HH:MM:SS.XXX), trim the .XXX off. How would you all suggest I do this? I know I need some sort of pattern match, but because I suck at regex (it is extremely hard for me to learn, I am finding out) I am not sure how to go about this. Here is my foreach statement:
foreach my $line (@rows) {
    $line =~ s/\t/\x09/ig;
    $line =~ s/\n/\x0A/ig;
    $line =~ s/\r/\x0D/ig;
    $line =~ s/"/'/ig;
    print FILE "\"$line\",";
    $cell++;
}
Thank you all loads for any help...

-----------------------
Billy S.
Slinar Hardtail - Hand of Dane
Datal Ephialtes - Guildless
RallosZek.Net Admin/WebMaster

perl -le '$cat = "cat"; if ($cat =~ /\143\x61\x74/) { print "Its a cat!\n"; } else { print "Thats a dog\n"; }'

Replies are listed 'Best First'.
Re: Pattern Match/Trim Variables.
by fruiture (Curate) on Feb 17, 2003 at 19:10 UTC

    Match "MM/DD/YYYY HH:MM:SS.XXX" and remove ".XXX"; that's it. This translates directly to Perl:

    s{^
        (           #capture
            \d{2}   #MM
            /
            \d{2}   #DD
            /
            \d{4}   #YYYY
            \s+
            \d{2}   #HH
            :
            \d{2}   #MM
            :
            \d{2}   #SS
        )           #stop capturing
        \.
        \d{3}       #XXX
    $}
    {$1}x           #replace by captured
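As a rough sketch of how that substitution behaves across a whole row (the @fields values below are made-up sample data, not from the original post), fields that are not in the full MM/DD/YYYY HH:MM:SS.XXX format are left alone because the pattern is anchored at both ends:

```perl
use strict;
use warnings;

# Hypothetical sample fields; only the first two are full timestamps.
my @fields = (
    '12/03/2001 12:00:01.000',
    '02/17/2003 19:01:00.123',
    'Slinar Hardtail',
    '3.141',
);

# $field is an alias for each array element, so the substitution edits @fields in place.
for my $field (@fields) {
    $field =~ s{^ ( \d{2} / \d{2} / \d{4} \s+ \d{2} : \d{2} : \d{2} ) \. \d{3} $}{$1}x;
}

print "$_\n" for @fields;
```

The two timestamps lose their fractional part, while 'Slinar Hardtail' and '3.141' come out unchanged.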

    Have you read perlrequick and perlre?

    --
    http://fruiture.de
      OK, taking the time to look at what you said has greatly helped. I began to think about how the code works, looked at your information, and it worked great. Thank you...

      -----------------------
      Billy S.
Re: Pattern Match/Trim Variables.
by dvergin (Monsignor) on Feb 17, 2003 at 20:01 UTC
    I like to keep my regexes simple.

    Strip a dot and some digits from the end of a string:

    my $str = '02/17/2003 11:44:19.123';
    $str =~ s/\.\d+$//;
    print "$str\n";
    You say you are struggling with regexes, so here is the explanation:

    We could *try* to say s/.\d+$// but the dot is magic: unescaped, it matches any character except a newline. So to match a literal dot we have to "escape" it with a backslash. Like this: \.

    Next comes \d which means "any digit". The plus after it means "one or more". Like this: \d+

    Then we have the $   When $ is used at the end of a regex, it means "anchor this match to the end of the string" (there are some nuances here we won't bother with). Actually, in this case we don't need the $ anchor -- we know that there is only one place in the string that will match "a dot followed by some digits". But it is perhaps a kindness to the next human who looks at the code to provide this visual clue that the match is expected to occur at the end of the string. So now we have: \.\d+$

    We put this regex snippet into a substitution regex: s/ / / But we leave the second half empty. This means "whatever you matched in the first half of the s///, replace it with nothing at all".

    So reading s/\.\d+$// straight off the page, we could translate it as: match a literal dot followed by some digits anchored to the end of the string and replace them with nothing.
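To make the point about escaping the dot concrete, here is a small sketch (the variable names are invented for illustration). On a value that already lacks the fractional part, the escaped version does nothing, while the unescaped dot happily matches the colon and eats the seconds:

```perl
use strict;
use warnings;

my $with_fraction = '02/17/2003 11:44:19.123';
(my $escaped = $with_fraction) =~ s/\.\d+$//;   # strips ".123"

my $no_fraction = '02/17/2003 11:44:19';
(my $safe = $no_fraction) =~ s/\.\d+$//;        # no literal dot, so no change
(my $oops = $no_fraction) =~ s/.\d+$//;         # "." matches ":", so ":19" is removed

print "$escaped\n";   # 02/17/2003 11:44:19
print "$safe\n";      # 02/17/2003 11:44:19
print "$oops\n";      # 02/17/2003 11:44
```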

    Hope that helps.

    ------------------------------------------------------------
    "Perl is a mess and that's good because the
    problem space is also a mess.
    " - Larry Wall

Re: Pattern Match/Trim Variables.
by enoch (Chaplain) on Feb 17, 2003 at 20:10 UTC
    You can replace those 4 substitutions with one transliteration.
    $line =~ tr/\t\n\r"/\x09\x0A\x0D'/;
    And, then, since the date format is fixed, just match up until the period.
    $line =~ s{
        (               # start capturing into $1
            [\d/\s:]+   # match digits, forward slashes, spaces,
                        # or colons 1 or more times
        )               # stop capturing into $1
        \.              # when you hit a period
        \d\d\d}         # match three more digits
        {$1}x;          # replace it all w/ $1


    enoch

    edit: removed the ig options from the tr because they are not necessary (and not even valid) options.
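For what it's worth, a minimal sketch of tr in action on a made-up line: since the tab, newline and carriage return entries map each character to itself, only the quote mapping changes anything, so this example keeps just that one. tr also returns the number of characters it matched:

```perl
use strict;
use warnings;

# Hypothetical CSV cell containing double quotes and a tab.
my $line = qq{say "hi"\tto "them"};

# tr returns how many characters it translated.
my $count = ($line =~ tr/"/'/);

print "$count quotes replaced: $line\n";
```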
Re: Pattern Match/Trim Variables.
by ihb (Deacon) on Feb 17, 2003 at 23:38 UTC

    Others have replied regarding your question, so I leave that. But I do want to suggest that the variable $line should be removed here. This is mostly a style question, but imho the code gets a lot nicer if $_ is used instead of another variable. The infamous $line is over-used, if you ask me.

    foreach (@rows) {
        s/\t/\x09/ig;
        s/\n/\x0A/ig;
        s/\r/\x0D/ig;
        s/"/'/ig;
        print FILE qq{"$_",}; # Other delimiters too.
        $cell++;
    }
    Other issues with this code are left aside.

    ihb
Re: Pattern Match/Trim Variables.
by steves (Curate) on Feb 17, 2003 at 22:09 UTC

    In defense of my pathetically ugly regexp, I offer two things:

    1. I was assuming, based on the original request, that some but not all dates needed fixing. Some of the other suggestions may not be specific enough to handle that.
    2. I sometimes (maybe too often) use that explicit \d\d type formatting to show what the pattern is. Anyway, \d{2} is 5 characters and \d\d is 4. 8-)
    But I'd probably take the first one that's nicely commented over mine.
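A quick sketch of the difference between the short and the strict patterns, using invented sample values rather than anything from the original data. The short one trims any trailing dot-plus-digits, while the fully spelled-out one only touches real timestamps:

```perl
use strict;
use warnings;

# Hypothetical field values: one timestamp, two things that merely contain a dot.
my @samples = ('02/17/2003 14:09:34.087', 'Price: 19.99', 'build 1.2345 ok');

my (%loose, %strict);
for my $s (@samples) {
    ($loose{$s}  = $s) =~ s/\.\d+$//;
    ($strict{$s} = $s) =~ s{^(\d\d/\d\d/\d{4} \d\d:\d\d:\d\d)\.\d{3}$}{$1};
    print "$s | loose: $loose{$s} | strict: $strict{$s}\n";
}
```

Both trim the timestamp, but the loose pattern also turns 'Price: 19.99' into 'Price: 19'. Whether that matters depends on what else can show up in the rows.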

Re: Pattern Match/Trim Variables.
by steves (Curate) on Feb 17, 2003 at 19:12 UTC

    I'm not sure what those first three substitutions are. You appear to be replacing characters with themselves, specifying the match as a meta-character and the replacement as a hex value.

    The s command you want to get rid of the trailing piece in the date is:

    my $date = '02/17/2003 14:09:34.087';
    $date =~ s#^(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d)\.\d\d\d$#$1#;

      Don't worry about those other replace strings. I am creating a CSV and need those for formatting... But how do I do an:
      if ($line =~ /\d\d\/\d\d\/\d\d\d\d \d\d:\d\d:\d\d\.\d\d\d/) {
          $line =~ s#^(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d)\.\d\d\d$#$1#;
      }
      Is that correct??

      -----------------------
      Billy S.

        You *could* do that, but the if() statement is completely unnecessary. If the match is not found in the string, then it won't do anything, thus the if() is redundant. As for that regex, I'd recommend you look further into the discussion and pick out one of the other (shorter!) ones.
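A small sketch of why the if() is redundant, with made-up input lines: s/// returns a true value only when it actually replaced something, and leaves a non-matching string exactly as it was.

```perl
use strict;
use warnings;

# Hypothetical inputs: one timestamp, one string that does not match.
my @lines = ('02/17/2003 14:09:34.087', 'no timestamp here');

for my $line (@lines) {
    # The return value is true only if a substitution happened.
    my $changed = ($line =~ s{^(\d\d/\d\d/\d{4} \d\d:\d\d:\d\d)\.\d{3}$}{$1});
    print(($changed ? 'trimmed:   ' : 'untouched: '), $line, "\n");
}
```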


        If the above content is missing any vital points or you feel that any of the information is misleading, incorrect or irrelevant, please feel free to downvote the post. At the same time, reply to this node or /msg me to tell me what is wrong with the post, so that I may update the node to the best of my ability. If you do not inform me as to why the post deserved a downvote, your vote does not have any significance and will be disregarded.

        Those first three substitutions are doing nothing unless there's some magic I'm missing. A tab has hex value 09, a newline hex value 0a, and a carriage return hex value 0d. The metacharacter representation and the hex representation in your substitutions evaluate to the same thing. So you're replacing each one with itself.
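This is easy to check with a short sketch, assuming an ASCII-based platform (the sample string is invented): each escape compares equal to its hex form, so running the three substitutions leaves the string byte-for-byte identical.

```perl
use strict;
use warnings;

# On ASCII-based platforms each escape and its hex form are the same character.
my $tab_same = ("\t" eq "\x09") ? 1 : 0;
my $nl_same  = ("\n" eq "\x0A") ? 1 : 0;
my $cr_same  = ("\r" eq "\x0D") ? 1 : 0;
print "tab: $tab_same, newline: $nl_same, carriage return: $cr_same\n";

# Hypothetical row value containing all three characters.
my $line = "a\tb\r\n";
(my $after = $line) =~ s/\t/\x09/g;
$after =~ s/\n/\x0A/g;
$after =~ s/\r/\x0D/g;

print $after eq $line ? "string unchanged\n" : "string changed\n";
```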
