sulfericacid has asked for the wisdom of the Perl Monks concerning the following question:

I nearly came close to writing another entire script all by myself without help!! yay me!! I'm actually getting more fluent with perl syntax and being able to debug and understand my own coding but I still have difficulty understanding new things until people tell me what it is and why I need it.

I was testing a variable against a regex so it had to contain nothing but numbers and a single decimal point. I had no clue how to do this so someone told me to use while($interest !~ /^-?\.?\d+(?:\.\d+)?$/) { Can someone please explain to me what each of those things are doing? This totally confuses me.

Thanks everyone!

"Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

sulfericacid

Replies are listed 'Best First'.
Re: What does this regex do?
by jasonk (Parson) on Apr 01, 2003 at 19:26 UTC
    m/
        ^            # anchor to the beginning of the string
        -?           # 0 or 1 dashes (for negative numbers I assume)
        \.?          # 0 or 1 decimal points
        \d+          # 1 or more numbers
        (?:\.\d+)?   # 0 or 1 decimal places followed by a number
        $            # anchor to the end of the string
    /x

    So basically what you have is a possibly negative, possibly decimal number (-?\.?\d+), followed by an optional decimal and number.

    # some examples
    /^-?\.?\d+/ matches 123 or -123 or -.123 or .123
    /^-?\.?\d+(?:\.\d+)?/ matches those and also 123.45 or -123.45 or -.123.45 (probably a bug!) or -123.23532 or .235235235235.235235235235235325

    The (?:...) construct lets you use parentheses for grouping, without assigning them to the $n variables, so your regexp contains parens that keep the \. and \d+ together, but doesn't assign the match to $1.
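
    To see the difference between the two kinds of parentheses, here is a small sketch. It assumes a helper named first_capture, which is made up for this illustration, and compares the regex from the question against a variant that uses plain capturing parentheses.

```perl
use strict;
use warnings;

# Hypothetical helper for this illustration: returns whatever ended up
# in $1 after matching $str against $re (undef if nothing was captured).
sub first_capture {
    my ($str, $re) = @_;
    return $str =~ $re ? $1 : undef;
}

my $capturing     = qr/^-?\.?\d+(\.\d+)?$/;     # plain parentheses
my $non_capturing = qr/^-?\.?\d+(?:\.\d+)?$/;   # (?:...) as in the question

for my $pair ([ 'capturing', $capturing ], [ 'non-capturing', $non_capturing ]) {
    my ($label, $re) = @$pair;
    my $got = first_capture('123.45', $re);
    printf "%-14s \$1 is %s\n", $label, defined $got ? "'$got'" : 'undef';
}
```

    Both patterns accept the same strings; only the capturing variant leaves the fractional part in $1.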


    We're not surrounded, we're in a target-rich environment!
Re: What does this regex do?
by grantm (Parson) on Apr 01, 2003 at 21:34 UTC

    You may find japhy's YAPE::Regex::Explain module to be useful in answering these types of questions. You'll need to install YAPE::Regex first. This is what it had to say about your example regex:

    #!/usr/bin/perl -w
    use strict;
    use YAPE::Regex::Explain;
    my $exp = YAPE::Regex::Explain->new('^-?\.?\d+(?:\.\d+)?$');
    print $exp->explain;

    The regular expression:

    (?-imsx:^-?\.?\d+(?:\.\d+)?$)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      ^                        the beginning of the string
    ----------------------------------------------------------------------
      -?                       '-' (optional (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      \.?                      '.' (optional (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times (matching
                               the most amount possible))
    ----------------------------------------------------------------------
      (?:                      group, but do not capture (optional
                               (matching the most amount possible)):
    ----------------------------------------------------------------------
        \.                       '.'
    ----------------------------------------------------------------------
        \d+                      digits (0-9) (1 or more times (matching
                                 the most amount possible))
    ----------------------------------------------------------------------
      )?                       end of grouping
    ----------------------------------------------------------------------
      $                        before an optional \n, and the end of the
                               string
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    Which may or may not help :-). Note that it acts as if the regex you supply had (?-imsx: ... ) wrapped around it, which could be a bit confusing.

Re: What does this regex do?
by MrYoya (Monk) on Apr 01, 2003 at 19:25 UTC
    /^-?\.?\d+(?:\.\d+)?$/
    It means match if there's a - in front or not, followed by an optional . (period), then one or more digits, then optionally a . (period) followed by one or more digits. Since it's anchored at the front and back, that's all it can match. So it matches numbers like
    -.5 or 5.5522 or -62.0
    Since it's using !~, the test is true for anything that does not match that.
Re: What does this regex do?
by Improv (Pilgrim) on Apr 01, 2003 at 19:31 UTC
    Ok, bit by bit,
    $interest is the thing you're applying the regex to.
    !~ is the "does not match" operator.
    The // enclose the regex
    ^ means to match beginning at the start of $interest
    -? means to optionally match a minus character
    \.? means to optionally match a literal period
    \d+ means to match one or more digits
    (?:    ) does grouping in a noncapturing way
    \. matches a literal period
    \d+ matches one or more digits
    The ? after that () means to optionally match that
    grouping.
    $ means the line must end there.
    
    I don't quite know why someone told you to wrap that
    in a while loop -- it doesn't really make sense, unless
    perhaps the contents of the while loop might explain it.
    In sum though, reading through my description there, you
    see that a number, for the purposes of this regex,
    optionally starts with a minus sign and maybe a decimal
    point, then has a sequence of digits, and possibly a decimal
    point plus a fractional part.
    
    Hope this helps..
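
    One plausible reason for the while loop (this is a guess, since the original loop body was never posted) is to keep asking until the value looks like a number. A minimal sketch of that idea, where the helper first_valid and its list of candidate inputs are made up for illustration and stand in for repeated prompts:

```perl
use strict;
use warnings;

# Hypothetical helper: walk through candidate inputs (standing in for
# repeated prompts) and return the first one the regex accepts, or
# undef if none of them does.
sub first_valid {
    my @candidates = @_;
    my $interest = shift @candidates;
    while (defined $interest && $interest !~ /^-?\.?\d+(?:\.\d+)?$/) {
        print "'$interest' is not a number, try again\n";
        $interest = shift @candidates;   # assumed: fetch the next attempt
    }
    return $interest;
}

my $accepted = first_valid('abc', '5%', '4.25');
print "accepted: $accepted\n";
```

    In a real script the shift would presumably be replaced by reading and chomping a line from STDIN.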
    
Re: What does this regex do?
by dpuu (Chaplain) on Apr 01, 2003 at 19:37 UTC
    OK. I assume that you understand the while loop part of it, so just the regex:
    $interest !~ /   # match against $interest (negated)
        ^            # start of line
        -?           # optional minus sign
        \.?          # optional decimal point
        \d+          # 1 or more digits
        (?:          # start non-capturing group (i.e. not $1)
            \.       # decimal point
            \d+      # 1 or more digits
        )?           # end of group -- match 0 or 1 times
        $            # end of line
    /x;              # end of regex (the /x modifier allows
                     # comments in the regex -- can be useful)
    So that's what the regex does. But it doesn't look quite right for its stated purpose -- for example, your numbers could have two decimal points. --Dave
      One more thing: you might want to look at Damian Conway's Regexp::Common package: it contains working regexes for many common situations (such as matching numbers). --Dave
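
    The two-decimal-point problem is easy to check directly. The sketch below runs the regex from the question over a few sample strings; the helper name accepts and the sample list are made up for this illustration.

```perl
use strict;
use warnings;

my $re = qr/^-?\.?\d+(?:\.\d+)?$/;   # the regex from the original question

# Hypothetical helper for this illustration: true when the regex accepts $str.
sub accepts {
    my ($str) = @_;
    return $str =~ $re ? 1 : 0;
}

for my $str ('1.5', '.5', '-.5', '.5.5', '-.123.45', '5.', '1.2.3') {
    printf "%-10s %s\n", $str, accepts($str) ? 'accepted' : 'rejected';
}
```

    Strings such as .5.5 and -.123.45 are accepted because the optional leading \.? and the optional group can each contribute a decimal point, while 5. is rejected because the group insists on digits after the point.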
Re: What does this regex do?
by cbro (Pilgrim) on Apr 01, 2003 at 19:50 UTC
    I'm about to head out, unfortunately, so I don't have time for a long write-up, but check this out (it really helped me when I was learning regexp):
    Click
    It's an outline of the regexp syntax with explanations and examples.
    I just hope you find it as useful as I did.
Re: What does this regex do?
by sulfericacid (Deacon) on Apr 01, 2003 at 19:44 UTC
    Thanks for your help everyone, this line looked really confusing at first. I was beginning to wonder what ? meant but now I know it means it's optional, which makes sense.

    So let me get this straight. The part before the (? matches anything that begins with a dot, a digit or more and a -? If so, what does everything inside (?: ) do? Only thing I can think of is before it hits that group it will ONLY match a dot, a - or a set of digits and the group itself allows you to use any combination of them (ie -1.094).

    Am I anywhere close? One other thing, you said it allows two decimal points, how can I only get it to accept one?

      As a general rule, a good way to build complex regexes is to be long-winded. Don't optimize. List each case in full:
      $interest !~ /
          ^                # start of line
          -?               # optionally negative
          (?:
              \d+          # numbers with no decimal point
            | \d+\.\d+     # numbers with decimal point in middle
            | \.\d+        # numbers with decimal point at start
            | \d+\.        # numbers with decimal point at end
          )
          $                # end of line
      /x;
      From this starting point, you can then attempt to optimize some of the cases, if you want. But there's no need to: perl will optimize them for you. --Dave
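
      A quick way to convince yourself that the long-winded version accepts at most one decimal point is to run it over a handful of strings. This is only a sketch: the helper name is_number and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# The case-by-case regex from above, compiled once with qr//x.
my $number = qr/
    ^                # start of line
    -?               # optionally negative
    (?:
        \d+          # numbers with no decimal point
      | \d+\.\d+     # numbers with decimal point in middle
      | \.\d+        # numbers with decimal point at start
      | \d+\.        # numbers with decimal point at end
    )
    $                # end of line
/x;

# Hypothetical helper for this illustration: true when $str is a number
# according to the regex above.
sub is_number {
    my ($str) = @_;
    return $str =~ $number ? 1 : 0;
}

for my $str ('123', '-1.094', '.5', '5.', '.5.5', '1.2.3', '.', '-') {
    printf "%-8s %s\n", $str, is_number($str) ? 'accepted' : 'rejected';
}
```

      Every alternative contains at most one \. so strings like .5.5 and 1.2.3 are rejected, while 5. is now accepted because that case is listed explicitly.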