Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I can't for the life of me work out how to do this as a regex, even though it should be simple! :o(

I have a series of numbers which can be positive or negative, with 1 to 3 digits before the decimal point and then 1 decimal place.

Where I'm getting stuck is how to say "it MAY have a - first, but maybe not" and how to say "1 to 3 digits before the decimal point".

So far I had:

if ($_ =~ /^((-)\d{1-3}\.\d)/) {

... but it's not working :o( Where am I going wrong?

Re: Regex woes...
by adamsj (Hermit) on Jul 09, 2001 at 03:33 UTC
    Try this:

    /^-?\d{1,3}\.\d/

    Update: I suppose it'd only be right to explain it.

    You know what the ^ does, and the -? says "Match that minus sign if it's there once or not at all."

    Note that, in your code, you've got \d{1-3} where you want \d{1,3}.

    adamsj

    They laughed at Joan of Arc, but she went right ahead and built it. --Gracie Allen
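
A minimal sketch for trying the corrected pattern from the reply above. The sample strings are hypothetical and chosen only for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Corrected pattern from the reply: optional minus, 1 to 3 digits,
# a literal dot, then one digit.
my $re = qr/^-?\d{1,3}\.\d/;

# Hypothetical sample inputs (assumed for this sketch).
my @samples = ('12.3', '-123.4', '1234.5', '-.5', 'abc');

for my $s (@samples) {
    printf "%-8s %s\n", $s, ($s =~ $re ? 'matches' : 'does not match');
}
```

Because the pattern has no anchor at the end, a string with extra trailing characters would still match.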

      I would refine the regex a bit. If the part to the left of the decimal point begins with a zero, that zero must be immediately followed by the decimal point (so 0.5 is fine, but 05.2 is not). So, even though the questioner does not specify this requirement, I would say:

      (/^-?\d{1,3}\.\d/) && !(/^-?0\d+/)

      I also point out for the questioner's benefit that the $_ =~ in $_ =~ /myregex/ is redundant. By default a regex operates on $_ when there is no explicit operand.

      And, just for fun, let's allow the number to begin with a '+' or a '-'. Hence:

      (/^[-+]?\d{1,3}\.\d/) && !(/^[-+]?0\d+/)
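
A small sketch of the two-step test above, next to a single-pattern variant that uses a negative lookahead. The helper name looks_valid and the sample strings are assumptions made for this example, and the single-pattern form is one possible alternative rather than anything proposed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: applies both tests from the post to one string.
sub looks_valid {
    my ($s) = @_;
    return ($s =~ /^[-+]?\d{1,3}\.\d/ && $s !~ /^[-+]?0\d+/) ? 1 : 0;
}

# Same idea as one pattern: the negative lookahead (?!0\d) rejects a
# leading zero that is followed by another digit.
my $single = qr/^[-+]?(?!0\d)\d{1,3}\.\d/;

# Hypothetical sample inputs (assumed for this sketch).
for my $s ('0.5', '+12.3', '-007.1', '05.2') {
    printf "%-7s two-step=%d single=%d\n",
        $s, looks_valid($s), ($s =~ $single ? 1 : 0);
}
```

For these samples both forms should agree: 0.5 and +12.3 are accepted, while -007.1 and 05.2 are rejected.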
Re: Regex woes...
by wog (Curate) on Jul 09, 2001 at 03:35 UTC
    Your regexp is correct except for the (-) part. The (-) requires a - to be there and captures it into $2. In this case the parens aren't really necessary. However, you probably want a ? after the -. The ? after something means that it may occur zero or one time.

    That said, you might want to replace the (-) not with -? but with [-+]?. [-+] creates a character class matching - or +, so it would allow numbers like +157.0.

    I think you probably want a $ or \z at the end of your regex. A $ says the match must end at the end of the string or just before a newline that ends the string; a \z says it must end at the end of the string, no exceptions.
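
A brief sketch of the difference between leaving the pattern open at the end, closing it with $, and closing it with \z. The variable names and test strings are assumptions made for this example.

```perl
use strict;
use warnings;

my $open   = qr/^-?\d{1,3}\.\d/;     # no end anchor
my $dollar = qr/^-?\d{1,3}\.\d$/;    # end of string, or before a final newline
my $zed    = qr/^-?\d{1,3}\.\d\z/;   # end of string only

# Hypothetical inputs: exact, one digit too many, trailing newline.
for my $s ("12.3", "12.34", "12.3\n") {
    (my $shown = $s) =~ s/\n/\\n/;   # make the newline visible when printing
    printf "%-8s open=%d dollar=%d z=%d\n", $shown,
        ($s =~ $open ? 1 : 0), ($s =~ $dollar ? 1 : 0), ($s =~ $zed ? 1 : 0);
}
```

Only the unanchored pattern accepts 12.34, and only \z rejects the string with the trailing newline, which matters if the input has not been chomped.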

    perlre is your friend, as always.

    Update: As adamsj pointed out, you also used - instead of , in \d{1-3}. I read too fast to notice.

      And that raises an interesting point, which probably has a trivial answer which I don't know. Why doesn't a construct like \d{1-3} make -w scream?

      Even more oddly, if it's reversed to say \d{3-1}, it neither makes -w scream nor does it DWIM and match, say, -12.3, like it would if it were \d{2}.

      adamsj


        -Mre=debug reveals the answer:

        ...
        Compiling REx `\d{3-1}'
        ...
           1: DIGIT(2)
           2: EXACT <{3-1}>(5)
           5: END(0)
        ...

        Apparently this means the regex got compiled to match a digit, and then the literal text {3-1}. It would be nice if perl recognized that there might have been confusion with the {M,N} syntax and threw a warning, though issues of backwards compatibility would need to be considered in doing this.

        If you look at regcomp.c, you'll see that when the regex parser finds an open brace, it looks for a string of digits, optionally followed by a comma and another optional string of digits. Anything else falls through as plain text. (See the S_regpiece() function.)

        japhy -- Perl and Regex Hacker
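
The brace check described above can be imitated in a few lines of Perl. This is only a sketch of the rule as the post states it, not perl's actual C code; the helper name looks_like_quantifier is hypothetical, and requiring a closing brace is an assumption, since the post does not spell that out.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from regcomp.c: returns true when the text
# starting at an open brace has the shape the post describes, i.e. digits,
# optionally a comma, optionally more digits, then a closing brace (assumed).
sub looks_like_quantifier {
    my ($text) = @_;
    return $text =~ /^\{\d+(?:,\d*)?\}/ ? 1 : 0;
}

for my $t ('{1,3}', '{2}', '{3,}', '{1-3}', '{3-1}') {
    printf "%-6s %s\n", $t,
        looks_like_quantifier($t) ? 'quantifier' : 'literal text';
}
```

Under this rule {1-3} and {3-1} both fail the check, which is consistent with the debug output showing them compiled as literal text. Other perl versions may treat such braces differently, so this sketch describes only the behaviour discussed in the thread.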