in reply to Re: Validating Numbers in CGI Script?
in thread Validating Numbers in CGI Script?

I know it's in the FAQ, but I will construct here on my own, for the good of all.
/^(?:-)?(?:\d+(?:\.\d+)?|\.\d+)$/
# becomes
/^-?(?:\d+(?:\.\d+)?|\.\d+)$/
# UPDATE!
/^-?(?=\d|\.\d)\d*\.?\d+$/
The look-ahead is used to remove the redundancy in your regex. It basically ensures there's either a digit, or a dot and THEN a digit. Then, it matches digits and/or a dot and digits, which we know will be valid.
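
For anyone who wants to check this by hand, here is a minimal sketch that runs the final regex against a few sample strings. The helper name looks_like_decimal and the list of samples are just made up for illustration; they aren't from the FAQ or from any module.

```perl
use strict;
use warnings;

# The final (UPDATE) regex from above, precompiled.
my $num_re = qr/^-?(?=\d|\.\d)\d*\.?\d+$/;

# Hypothetical helper: returns 1 if the string matches, 0 otherwise.
sub looks_like_decimal {
    my ($str) = @_;
    return $str =~ $num_re ? 1 : 0;
}

# Sample inputs chosen for illustration only.
for my $s ('1234', '.1234', '12.34', '-0.5', '1234.', '.', '-', '1.2.3', '') {
    printf "%-8s %s\n", "'$s'", looks_like_decimal($s) ? 'match' : 'no match';
}
```

The first four samples should match and the rest should not. One caveat: in Perl, $ also matches just before a trailing newline, so if the input hasn't been chomped and you want to reject that, \z is probably the safer anchor.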

japhy -- Perl and Regex Hacker

Re: Re: Re: Validating Numbers in CGI Script?
by dha (Acolyte) on May 11, 2001 at 03:08 UTC
    Of course, if you're going to use a regex(p), you might as well make it easy on yourself:

    use Regexp::Common;

    if ($thing =~ /^$RE{num}{dec}$/) {
        ...
    }

    (this particular example, of course, requires DWIM.pm... :-)

    dha

      I'm in favor of the user knowing what the regex does, before using some pre-packaged regex. No offense to Damian, of course, but I'd rather know what my regex does exactly. That said, some interested party could read the module's source.

      japhy -- Perl and Regex Hacker
Re (tilly) 2: Validating Numbers in CGI Script?
by tilly (Archbishop) on May 11, 2001 at 05:56 UTC
    I can only refer you to the article that MeowChow mentioned. Taking a look at Tom Christiansen's list of benefits that someone who rolls their own RE usually misses, you get zero out of 3 of them.

    RE's are not always the right answer...

    UPDATE
    Changed the link because the page moved. (Thanks footpad.)

Re: Re: Re: Validating Numbers in CGI Script?
by larryk (Friar) on May 10, 2001 at 19:27 UTC
    cheers for that - don't know what I was thinking with the (?:-)? bit - maybe the smiley just cheered us all up!

    correct me if i'm wrong but won't your regex pick up a number which finishes a sentence _and_ the full stop (period to americans)? this would lead to perl treating it as a string in some cases (eg. ++ or --).

      Well, let's take this guy apart. We know how the leading -? will be treated, so let's leave it out. Then we get two paths that this regex can follow:

      (?=\d)\d*\.?\d+$

      and

      (?=\.\d)\d*\.?\d+$

      If the regex takes the first path, we're guaranteed to get at least one digit. This digit must be matched by either the \d* or the \d+. If it is matched by the \d*, then we're guaranteed to match the equivalent of

      \d+\.?\d+

      which will match any string made up of just digits (two or more of them), and any two runs of digits with a single period between them.

      If it's matched by the \d+ instead, then we know that there are no periods in the string, since both the optional \d* and the optional \.? must have matched nothing.

      So this first path will always match a whole number with no fractional portion, or a fractional number with a mandatory leading whole-number portion.

      Now in the other path: (?=\.\d)\d*\.?\d+$

      If this path matches, we're guaranteed to have a period and then a digit. Since this is a lookahead, that means the following \d* has to match nothing, since we can't possibly have picked up a digit prior to the period.

      So the actual regex in this path is \.\d+, which clearly just matches a fractional number with no preceding 0.

      So in any case, I don't think it will match anything other than numbers, as japhy suggests.
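
      The reasoning above can also be checked by brute force. This is only a sketch: it assumes that comparing the look-ahead version against the earlier alternation version on every short string over a small alphabet is a fair spot check, and the helper names all_strings and disagreements are invented for the example.

      ```perl
      use strict;
      use warnings;

      # The two regexes from japhy's post: alternation form and look-ahead form.
      my $alternation = qr/^-?(?:\d+(?:\.\d+)?|\.\d+)$/;
      my $lookahead   = qr/^-?(?=\d|\.\d)\d*\.?\d+$/;

      # Hypothetical helper: every string of length 0..$max_len over @alphabet.
      sub all_strings {
          my ($max_len, @alphabet) = @_;
          my @all   = ('');
          my @level = ('');
          for (1 .. $max_len) {
              my @next;
              for my $prefix (@level) {
                  push @next, $prefix . $_ for @alphabet;
              }
              @level = @next;
              push @all, @level;
          }
          return @all;
      }

      # Hypothetical helper: the strings on which the two regexes disagree.
      sub disagreements {
          my @strings = @_;
          return grep {
              ($_ =~ $alternation ? 1 : 0) != ($_ =~ $lookahead ? 1 : 0)
          } @strings;
      }

      # Two digits, a period and a minus sign are enough to hit every branch.
      my @diff = disagreements(all_strings(5, '0', '7', '.', '-'));
      print @diff ? "differ on: @diff\n" : "no disagreements found\n";
      ```

      A clean run over short strings isn't a proof, of course, but it agrees with the path-by-path argument: both forms accept digits, digits.digits and .digits, each with an optional leading minus.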

      Hmm, actually, I think I meant to make the regex be /^-?(?=\d|\.\d)\d*\.?\d+$/. Yes, that looks far more sane. The other could have matched "1234.", which isn't proper. This one can match "1234", ".1234", and "12.34".

      japhy -- Perl and Regex Hacker
        I hate to say this, but sometimes I actually need to use numbers like "300." This is sometimes used to denote the difference between 3*10^2 and 3.00*10^2. The latter has three significant digits, while the former has only one.

        But don't fix your regex on my account; I'm sure that my situation is in the minority...
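
        For anyone who is in that minority, here is a minimal sketch of one possible variant. It assumes that relaxing the final \d+ to \d* is acceptable, which lets "300." through while the look-ahead still rejects a bare "."; the helper name is made up for illustration.

        ```perl
        use strict;
        use warnings;

        # Assumed variant: the trailing \d+ becomes \d*, so a number may end in a period.
        my $trailing_dot_re = qr/^-?(?=\d|\.\d)\d*\.?\d*$/;

        # Hypothetical helper: returns 1 if the string matches, 0 otherwise.
        sub looks_like_decimal_or_trailing_dot {
            my ($str) = @_;
            return $str =~ $trailing_dot_re ? 1 : 0;
        }

        # Sample inputs chosen for illustration only.
        for my $s ('300.', '300', '3.00', '.5', '.', '-.', '') {
            printf "%-6s %s\n", "'$s'",
                looks_like_decimal_or_trailing_dot($s) ? 'match' : 'no match';
        }
        ```

        The look-ahead is what keeps this safe: it still demands a digit, or a period followed by a digit, before anything else is consumed.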

        buckaduck