Re: What does this regex do?
by jasonk (Parson) on Apr 01, 2003 at 19:26 UTC
m/
^ # anchor to the beginning of the string
-? # 0 or 1 dashes (for negative numbers I assume)
\.? # 0 or 1 decimal points
\d+ # 1 or more digits
(?:\.\d+)? # optionally, a decimal point followed by 1 or more digits
$ # anchor to the end of the string
/x
So basically what you have is a possibly negative number with an optional leading decimal point (-?\.?\d+), followed by an optional decimal point and more digits.
# some examples
/^-?\.?\d+/ matches
123 or
-123 or
-.123 or
.123
/^-?\.?\d+(?:\.\d+)?/ matches those and also
123.45 or
-123.45 or
-.123.45 (probably a bug!) or
-123.23532 or
.235235235235.235235235235235325
The (?:...) construct lets you use parentheses for grouping, without assigning them to the $n variables, so your regexp contains parens that keep the \. and \d+ together, but doesn't assign the match to $1.
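To see those cases for yourself, here is a minimal sketch that runs the full anchored pattern against a few strings. The helper name looks_numeric and the sample inputs are made up for illustration; they are not from the original question. The last few lines use a capturing variant of the same pattern to show what (?:...) avoids.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern under discussion, compiled once with qr// so it can be reused.
my $number_re = qr/^-?\.?\d+(?:\.\d+)?$/;

# Hypothetical helper: returns 1 if the string matches the pattern, 0 otherwise.
sub looks_numeric {
    my ($text) = @_;
    return $text =~ $number_re ? 1 : 0;
}

# Illustrative inputs, including the double-decimal case flagged above.
for my $sample ('123', '-.123', '123.45', '-.123.45', '1.2.3', 'abc') {
    printf "%-10s %s\n", $sample,
        looks_numeric($sample) ? 'matches' : 'no match';
}

# Same pattern with a capturing group: (...) sets $1, (?:...) would not.
if ('123.45' =~ /^-?\.?\d+(\.\d+)?$/) {
    print "capturing group put '$1' in \$1\n";
}
```

Note that -.123.45 is accepted while 1.2.3 is not: the first dot is only allowed directly before the first run of digits.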
We're not surrounded, we're in a target-rich environment!
Re: What does this regex do?
by grantm (Parson) on Apr 01, 2003 at 21:34 UTC
You may find japhy's YAPE::Regex::Explain module useful in answering these types of questions. You'll need to install YAPE::Regex first. This is what it had to say about your example regex:
#!/usr/bin/perl -w
use strict;
use YAPE::Regex::Explain;
my $exp = YAPE::Regex::Explain->new('^-?\.?\d+(?:\.\d+)?$');
print $exp->explain;
The regular expression:
(?-imsx:^-?\.?\d+(?:\.\d+)?$)
matches as follows:
NODE EXPLANATION
----------------------------------------------------------------------
(?-imsx: group, but do not capture (case-sensitive)
(with ^ and $ matching normally) (with . not
matching \n) (matching whitespace and #
normally):
----------------------------------------------------------------------
^ the beginning of the string
----------------------------------------------------------------------
-? '-' (optional (matching the most amount
possible))
----------------------------------------------------------------------
\.? '.' (optional (matching the most amount
possible))
----------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
----------------------------------------------------------------------
\. '.'
----------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
----------------------------------------------------------------------
)? end of grouping
----------------------------------------------------------------------
$ before an optional \n, and the end of the
string
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
Which may or may not help :-). Note that it acts as if the regex you supply had (?-imsx: ... ) wrapped around it, which could be a bit confusing.
Re: What does this regex do?
by MrYoya (Monk) on Apr 01, 2003 at 19:25 UTC
/^-?\.?\d+(?:\.\d+)?$/
It means match if there's a - in front or not, followed by an optional . (period), then one or more digits, then optionally a . (period) followed by one or more digits. Since it's anchored at the front and back, that's all it can match. So it matches numbers like
-.5 or 5.5522 or -62.0
Since the code uses !~, the test is true for anything that does not match that pattern.
Re: What does this regex do?
by Improv (Pilgrim) on Apr 01, 2003 at 19:31 UTC
$interest is the thing you're applying the regex to.
!~ is the "does not match" operator.
The // enclose the regex
^ means to match beginning at the start of $interest
-? means to optionally match a minus character
\.? means to optionally match a literal period
\d+ means to match one or more digits
(?: ) does grouping in a noncapturing way
\. matches a literal period
\d+ matches one or more digits
The ? after that () means to optionally match that
grouping.
$ means the string must end there (a single trailing newline is allowed).
I don't quite know why someone told you to wrap that
in a while loop -- it doesn't really make sense, unless
perhaps the contents of the while loop might explain it.
In sum though, reading through my description there, you
see that a number, for the purposes of this regex,
optionally starts with a minus sign and maybe a decimal
point, followed by a sequence of digits and possibly a decimal
point plus a fractional part.
Hope this helps..
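One plausible reading of the while loop (an assumption, since the loop body was not posted) is a "keep asking until the input looks like a number" pattern. A rough sketch of that idea follows. The function name first_valid_number is hypothetical, and the list of candidates stands in for repeated prompts to the user.

```perl
use strict;
use warnings;

# Hypothetical helper: walks through candidate inputs in order and returns
# the first one that matches the number pattern, or undef if none do.
sub first_valid_number {
    my @candidates = @_;
    my $interest = shift @candidates;

    # Loop for as long as the current value does NOT match the pattern.
    while (defined $interest && $interest !~ /^-?\.?\d+(?:\.\d+)?$/) {
        $interest = shift @candidates;   # stands in for re-prompting the user
    }
    return $interest;
}

my $accepted = first_valid_number('abc', '7%', '7.25');
print defined $accepted ? "accepted $accepted\n" : "nothing valid\n";
```

In a real script the shift would probably be a read from STDIN followed by chomp, but that is a guess about the original code.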
Re: What does this regex do?
by dpuu (Chaplain) on Apr 01, 2003 at 19:37 UTC
OK. I assume that you understand the while loop part of it, so just the regex:
$interest !~ / # match against $interest (negated)
^ # start of string
-? # optional minus sign
\.? # optional decimal point
\d+ # 1 or more digits
(?: # start non-capturing group (i.e. not $1)
\. # decimal point
\d+ # 1 or more digits
)? # end of group -- match 0 or 1 times
$ # end of string (or just before a trailing newline)
/x; # end of regex (the /x modifier allows
# comments in the regex -- can be useful)
So that's what the regex does. But it doesn't look quite right for its stated purpose -- for example, your numbers could have two decimal points.
--Dave
One more thing: you might want to look at Damian Conway's Regexp::Common package: it contains working regexes for many common situations (such as matching numbers).
--Dave
Re: What does this regex do?
by cbro (Pilgrim) on Apr 01, 2003 at 19:50 UTC
I'm about to head out, unfortunately, so I don't have time for a long write-up, but check this out (it really helped me when I was learning regexp):
Click
It's an outline of the regexp syntax with explanations and examples.
I just hope you find it as useful as I did.
Re: What does this regex do?
by sulfericacid (Deacon) on Apr 01, 2003 at 19:44 UTC
Thanks for your help everyone, this line looked really confusing at first. I was beginning to wonder what ? meant but now I know it means it's optional, which makes sense.
So let me get this straight. The part before the (? matches anything that begins with a dot, a digit or more and a -? If so, what does everything inside (?: ) do? Only thing I can think of is before it hits that group it will ONLY match a dot, a - or a set of digits and the group itself allows you to use any combination of them (ie -1.094).
Am I anywhere close? One other thing, you said it allows two decimal points, how can I only get it to accept one?
"Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"
sulfericacid
As a general rule, a good way to build complex regexes is to be long-winded. Don't optimize. List each case in full:
$interest !~ /
^ # start of string
-? # optionally negative
(?:
\d+ # numbers with no decimal point
| \d+\.\d+ # numbers with decimal point in middle
| \.\d+ # numbers with decimal point at start
| \d+\. # numbers with decimal point at end
)
$ # end of string
/x;
From this starting point, you can then attempt to optimize some of the cases, if you want. But there's no need to: perl will optimize them for you.
--Dave
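For anyone who wants to try the long-winded version, here is a small sketch that compiles it with qr//x and compares it against one possible hand-compacted form. The compact pattern, the helper name accepts, and the sample inputs are assumptions made for this illustration; they do not come from the thread.

```perl
use strict;
use warnings;

# The long-winded version: one alternative per case, at most one decimal point.
my $verbose_re = qr/
    ^               # start of string
    -?              # optionally negative
    (?:
        \d+         # digits with no decimal point
      | \d+\.\d+    # decimal point in the middle
      | \.\d+       # decimal point at the start
      | \d+\.       # decimal point at the end
    )
    $               # end of string
/x;

# One possible compacted form (an assumption of this sketch).
my $compact_re = qr/^-?(?:\d+\.?\d*|\.\d+)$/;

# Hypothetical helper: 1 if $text matches $re, 0 otherwise.
sub accepts {
    my ($re, $text) = @_;
    return $text =~ $re ? 1 : 0;
}

for my $sample ('123', '-123.45', '.5', '5.', '-.123.45', '.', '-') {
    my $v = accepts($verbose_re, $sample);
    my $c = accepts($compact_re, $sample);
    print "$sample: verbose=$v compact=$c\n";
}
```

Both patterns should agree on every sample, and both reject -.123.45, which the original regex accepted.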