noggon has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am looking for some wisdom about a regular expression:

($ref !~ /^-?\d*(\.0*)?$/)

Firstly, I think that this identifies $ref as an integer (i.e. $ref is not a non-integer). That being the case, I want to adapt this to identify $ref as a negative integer.

I tried this:

($ref !~ /^-[1-9]\d*$/)

but I was wrong...

Help appreciated.

Scott

Replies are listed 'Best First'.
Re: positive integer identification (regular expression question)
by clscott (Friar) on Feb 06, 2003 at 20:45 UTC

    Do you want to see if it is an integer, or if it's a string that looks like an integer?

    Since you've anchored the regex at the beginning and at the end, I'm going to assume that you're checking that the value is a positive integer.

    Check if the value is a Positive Integer:
    if ( $ref == int($ref) and $ref eq int($ref) and $ref > 0 ) { print "$ref is a positive int!\n" }

    I'll leave the negative test as an exercise for the reader.

    update: added an additional test to prevent strings like 23EatAtJoes being considered a positive integer.

    This method fails and is unwieldy with integers larger than 999999999999999 on my system. I suggest using one of the more robust methods suggested in the thread.

    --
    Clayton aka "Tex"
Re: positive integer identification (regular expression question)
by PodMaster (Abbot) on Feb 07, 2003 at 06:03 UTC

    Regexp::Common



Re: positive integer identification (regular expression question)
by Enlil (Parson) on Feb 06, 2003 at 21:24 UTC
    What this:
    /^-?\d*(\.0*)?$/
    matches is anything that looks like an integer (though it also incorrectly matches an empty string). Whereas this:
    /^-?\d+(\.0*)?$/
    matches an integer:
    • At the beginning of the variable there might be a - (the ? makes it optional). The ^ fixes it to the start of the string.
    • The \d+ makes sure that there are one or more digits (either after the -, if it is there, or at the beginning of the string). The way it was before, it would have matched the empty string as an integer.
    • Next, (\.0*)? lets a . followed by zero or more 0s be included in the match, as 1.0000 is still considered an integer. This whole group is optional because of the ?.
    • Lastly, the $ ensures that the match must reach the end of the string: after the .000 part if it is there, or otherwise directly after the digits matched by \d+.

    But in answer to your query, all you would have to do to check for just negative integers is make the - mandatory in the regular expression (i.e. remove the ? that follows it).

    As always, perlretut and perlre are good reads.
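
The suggestion above can be sketched roughly as follows. This is only an illustration: the sub name is_negative_integer_loose and the sample strings are made up for the example and do not come from the thread. Note that with this pattern -0 and -0.000 also match, which may or may not be what is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): true when the string
# is a mandatory "-", one or more digits, and an optional ".000"-style tail.
sub is_negative_integer_loose {
    my ($ref) = @_;
    return 0 unless defined $ref;
    return $ref =~ /^-\d+(\.0*)?$/ ? 1 : 0;
}

# Sample inputs chosen for illustration only.
for my $candidate ('-12', '-12.000', '12', '', '-', '-1.5', '-0') {
    printf "%-10s %s\n", "'$candidate'",
        is_negative_integer_loose($candidate) ? 'matches' : 'does not match';
}
```

The original test used !~, which is true when the string does not match; the sketch uses =~ so that a true result means "looks like a negative integer".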

    -enlil

      Thank you all - it has been an enlightening and humble experience being in your company and reading your wisdom.

      Thank you.

      Scott

      Is there any reason why capturing parentheses are used around \.0*?

      Also, the ? does not make the preceding pattern optional. As per the Camel book, p. 39: "you can force nongreedy, minimal matching by placing a question mark after any quantifier". A ? not immediately preceded by a quantifier thus makes no sense.

      In order to make the preceding pattern optional, use the * quantifier, which means zero or more times.

      CountZero

      "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

        If I do this, (\.0*)*, then the pattern (\.0*) will match 0 or more times. That means .0 would match, but so would .0.0.0 and .000.00, and these latter strings are not what belongs at the end of an integer. The ?, on the other hand, means: try to match whatever pattern comes before me 1 or 0 times (in this instance making it optional), unless the ? comes directly after a quantifier (e.g. +, *, ? or {min,max}), in which case it makes that quantifier non-greedy. I could alternatively have used (\.0*){0,1} in this instance.

        If, on the other hand, I were to use ??, it would still match 0 or 1 times, but it would try 0 first and then 1. The single ? tries to match 1 time first and, failing that, carries on as if the pattern were not there, which is what makes it optional. I understand I might be unclear, but I can clarify further if needed. I know this is explained in the Mastering Regular Expressions book, but I have not yet located it in the Camel book (I will update when/if I find it).
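
A small sketch may make the difference between ?, * and ?? concrete. The variable names and test strings below are invented for illustration, and the greedy/lazy comparison deliberately drops the trailing $ so that the two forms can capture different amounts.

```perl
use strict;
use warnings;

my $optional = qr/^-?\d+(\.0*)?$/;   # group may appear 0 or 1 times
my $repeated = qr/^-?\d+(\.0*)*$/;   # group may appear any number of times

# Compare the two patterns on a few illustrative strings.
for my $candidate ('1', '1.0', '1.0.0.0') {
    printf "%-8s optional:%-3s repeated:%s\n", $candidate,
        ($candidate =~ $optional ? 'yes' : 'no'),
        ($candidate =~ $repeated ? 'yes' : 'no');
}

# Greedy ? tries one occurrence first; lazy ?? tries zero occurrences first.
# Without a trailing $, the lazy form is free to stop before the ".00".
my ($greedy) = '1.00' =~ /^(\d+(?:\.0*)?)/;
my ($lazy)   = '1.00' =~ /^(\d+(?:\.0*)??)/;
print "greedy captured '$greedy', lazy captured '$lazy'\n";
```

With the $ anchor in place, both ? and ?? accept the same strings; the order in which the alternatives are tried only shows up in what gets captured when the rest of the pattern allows either choice.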

        update: As for your question, there is no reason for them to be capturing parens; it could just as well be (?:\.0*)?. I did not figure it mattered in this case. As I am lazy, I don't normally put the ?: in the parens to make them non-capturing unless I care what goes into the $1, $2, ... series of variables that the capturing parens set.

        As for its location in the Camel book (3rd edition), refer to page 142, at the paragraph starting with "Another thing you'll see are what we call quantifiers, ..."

        -enlil

Re: positive integer identification (regular expression question)
by CukiMnstr (Deacon) on Feb 06, 2003 at 21:21 UTC
    I want to adapt this to identify $ref as a negative integer.

    then you want:

    ($ref =~ /^-[1-9]\d*$/)
    (i.e. change !~ to =~)
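
As a rough sketch of how that test might be used (the sub name is_negative_integer and the sample strings are hypothetical, chosen only for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern above: a "-", then a digit from
# 1 to 9, then any further digits, and nothing else.
sub is_negative_integer {
    my ($ref) = @_;
    return 0 unless defined $ref;
    return $ref =~ /^-[1-9]\d*$/ ? 1 : 0;
}

# Sample inputs chosen for illustration only.
for my $candidate ('-7', '-120', '-0', '-07', '7', '-7.0') {
    printf "%-8s %s\n", "'$candidate'",
        is_negative_integer($candidate)
            ? 'is a negative integer'
            : 'is not a negative integer';
}
```

Compared with the looser pattern discussed earlier in the thread, this one rejects -0, leading zeros such as -07, and a trailing .0. In Perl, $ also matches just before a final newline, so "-7\n" would pass; \z can be used instead of $ if that matters.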

    hope this helps,