in reply to more date conversion happiness, part 3

Your regexes have a problem. If you have something like $str =~ /\d/, that will match the first digit in $str, wherever in the string it occurs. When you are validating or parsing, you usually want to provide anchors: zero-width assertions about where in the string the regex is allowed to match. For instance, to accept only a string of zero or more digits followed by one or more letters, you would say:
$str =~ /^\d*[a-z]+\z/i;
where the ^ says the pattern can only start matching at the beginning of the string and \z says it must finish matching at the very end of the string, so that no unexpected characters are allowed before or after the pattern specified.

Without the anchors, you get /\d*[a-z]+/i which will match any string that has at least one letter somewhere in it, e.g. ";!$#a-+".
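To see the difference side by side, here is a small sketch. The sample strings and the helper names matches_anchored and matches_unanchored are made up for illustration; only the two regexes come from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two regexes discussed above.
sub matches_anchored   { my ($str) = @_; return $str =~ /^\d*[a-z]+\z/i ? 1 : 0 }
sub matches_unanchored { my ($str) = @_; return $str =~ /\d*[a-z]+/i    ? 1 : 0 }

# Made-up sample inputs, chosen only to show where the two disagree.
for my $str ('123abc', 'abc', ';!$#a-+', '12ab!') {
    printf "%-10s anchored=%d unanchored=%d\n",
        "'$str'", matches_anchored($str), matches_unanchored($str);
}
# '123abc' and 'abc' pass both tests; ';!$#a-+' and '12ab!' pass only
# the unanchored one, because it just needs a letter somewhere.
```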

(You will often see $ used instead of \z; that will match either at the end of the string or immediately before a newline character at the end of the string; sometimes handy when dealing with unchomped input, but usually not what is actually wanted.)
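A quick way to check the difference between $ and \z is to try both on a line that still has its trailing newline. The input here is an invented example, not taken from the original script.

```perl
use strict;
use warnings;

my $line = "abc\n";    # made-up unchomped input, e.g. straight from <STDIN>

# $ is satisfied just before the trailing newline, so this matches.
my $dollar_ok = $line =~ /^[a-z]+$/ ? 1 : 0;

# \z insists on the real end of the string, so the newline makes it fail.
my $z_ok = $line =~ /^[a-z]+\z/ ? 1 : 0;

print "with \$:  ", ($dollar_ok ? "match" : "no match"), "\n";    # match
print "with \\z: ", ($z_ok      ? "match" : "no match"), "\n";    # no match
```

After chomp($line), both forms match.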

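The meaning of ^ and $ changes when the //m flag is used: they then match at the start and end of any line within the string, not just the string as a whole. See perlre for details.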
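As a rough illustration of the //m behaviour, with a made-up multi-line string:

```perl
use strict;
use warnings;

my $text = "first line\n42\nlast line";    # invented sample data

# Without /m, ^ and $ refer to the whole string, so a digits-only
# test fails: the string as a whole is not all digits.
my $plain = $text =~ /^\d+$/ ? 1 : 0;

# With /m, ^ and $ can match around each embedded line, so the
# middle line "42" satisfies the pattern.
my $multi = $text =~ /^\d+$/m ? 1 : 0;

# \A and \z are not affected by /m; they always mean the real
# start and end of the string.
my $strict = $text =~ /\A\d+\z/m ? 1 : 0;

print "plain=$plain multi=$multi strict=$strict\n";    # plain=0 multi=1 strict=0
```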

Re: Re: more date conversion happiness, part 3
by ctp (Beadle) on Jan 12, 2004 at 03:24 UTC
    Thanks. Funny you mention the anchors...the first couple of versions of the script had all the anchors in place, but I got several replies that said I didn't need them, or should take them out, so I did. In the case of this script (as far as the input it was written to handle), the regexes appear to work either way, but indeed with different input they might not.
      It really depends on whether your task is to take specified input and validate and parse it, or look for any date-like thing in the input. Without the anchors, m:\d{2}/\d{2}/\d{2}: will quite happily match the "04/08/84" in "123/3/104/08/840/2".
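That example can be checked with a short sketch like the following. The variable names are arbitrary, and the anchored form is just one way of writing the stricter test.

```perl
use strict;
use warnings;

my $input = "123/3/104/08/840/2";

# Unanchored: the regex engine slides along until it finds a fit.
my ($found) = $input =~ m:(\d{2}/\d{2}/\d{2}):;
print "unanchored found: $found\n";    # unanchored found: 04/08/84

# Anchored: the whole string must be exactly dd/dd/dd, so this fails.
my $valid = $input =~ m:^\d{2}/\d{2}/\d{2}\z: ? 1 : 0;
print "anchored valid: $valid\n";      # anchored valid: 0
```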
        Indeed...I did some experimenting with that. This particular assignment had a very specific input it had to deal with. But I'm working on something right now that will use the anchors for exactly what you mentioned. Thanks!