Do it like this:

    $_ = "I have 2 numbers: 53147";
    if (/(.*?)(\d+)/) {
        print "Beginning is <$1>,number is <$2>.\n";
    }
    #prints: Beginning is <I have >,number is <2>.
    # Don't tell me that $1 = "I have ".
    # Just execute the print statement and show the output.

First, it is almost always a bad idea to assign to $_ explicitly. I recommend against doing that. Better is:

    use warnings;
    use strict;

    my $string = "I have 2 numbers: 53147";
    if ($string =~ /(.*?)(\d+)/) {
        print "Beginning is <$1>,number is <$2>.\n";
    }
    #prints: Beginning is <I have >,number is <2>.

What you have here is what is called a "regular expression" or "regex": m/(.*?)(\d+)/ (the m, for "match", is implicit when the delimiters are slashes). The (.*?) term means that we are going to match the minimum span of any characters (which may, by the way, even be zero characters) that still allows the next regex term to "match", if it is possible to do so.
So basically, "(.*?)" means all characters up to but not including the first digit seen, i.e. the shortest string that stops just short of the first digit. Note that this does include the space before the first digit. "(\d+)" means: now that we have seen a digit, get me all of the digits that follow it consecutively. This is how you get "I have " and then "2" for $1 and $2 respectively.
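To see the "may even be zero characters" case for yourself, here is a minimal sketch. The helper name split_at_first_number and the two extra test strings are my own, made up purely for illustration. It runs the same lazy pattern against a string that starts with a digit and against one with no digits at all.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): returns ($1, $2)
# for the lazy pattern, or an empty list when the string has no digit.
sub split_at_first_number {
    my ($string) = @_;
    my @captures = $string =~ /(.*?)(\d+)/;   # list context: ($1, $2) or ()
    return @captures;
}

for my $string ("I have 2 numbers: 53147", "42 is the answer", "no digits here") {
    my @parts = split_at_first_number($string);
    if (@parts) {
        print "Beginning is <$parts[0]>,number is <$parts[1]>.\n";
    }
    else {
        print "No match for <$string>.\n";
    }
}
```

For the second string the first capture comes back as an empty string, and for the third the match fails outright because (\d+) needs at least one digit.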
You should experiment when faced with a regex like this. Change the string to, say, "I have 6718 numbers: 53147" and see what that prints. It will print: Beginning is <I have >,number is <6718>. "2" has now become "6718", just as the previous paragraph would lead you to expect.
Now, let's experiment more. That ? in the first capture term matters a lot! The ? "minimizes" the length of the match. Let's say that we have (no ? character):

    my $string = "I have 2 numbers: 53147";
    if ($string =~ /(.*)(\d+)/) {
        print "Beginning is <$1>,number is <$2>.\n";
    }
    #prints: Beginning is <I have 2 numbers: 5314>,number is <7>.

That (.*) means: give me the maximal length string while still allowing (\d+) to match. In effect, working back from the right end of the string, "7" is the shortest thing that matches "one or more digits" (\d+), and sure enough (.*) matches everything in front of that. (.*) matches the longest thing that still allows (\d+) to match, albeit with just a single digit!
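If you want to compare the two behaviours directly, a small sketch like this runs both patterns against the same string. The describe_match helper is hypothetical, just something I made up to avoid repeating the print statement.

```perl
use strict;
use warnings;

my $string = "I have 2 numbers: 53147";

# Hypothetical helper: apply one compiled pattern and describe the captures.
sub describe_match {
    my ($label, $regex, $text) = @_;
    if (my ($beginning, $number) = $text =~ $regex) {
        return "$label: Beginning is <$beginning>,number is <$number>.";
    }
    return "$label: no match";
}

print describe_match('lazy   (.*?)', qr/(.*?)(\d+)/, $string), "\n";
print describe_match('greedy (.*) ', qr/(.*)(\d+)/,  $string), "\n";
```

The only difference between the two patterns is the ?, so any difference in the output comes from minimal versus maximal matching of the first term.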
Let's say that you knew that there were two numbers (sequences of digits) in this string.

    my $string = "I have 325 numbers: 98765 12324";
    if ($string =~ /(\d+)\D+(\d+)/) {
        print "First number is <$1>,second number is <$2>.\n";
    }
    #prints: First number is <325>,second number is <98765>.

The regex says: capture the first sequence of digits, ignore a sequence of one or more non-digits, and then capture the next sequence of digits.
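If you don't know in advance how many numbers there are, one common approach (my suggestion here, not something the question asked for) is to use the /g modifier in list context, which returns every match of the pattern. A minimal sketch:

```perl
use strict;
use warnings;

my $string = "I have 325 numbers: 98765 12324";

# In list context, a /g match returns all the captured substrings.
my @numbers = $string =~ /(\d+)/g;

print "Found ", scalar(@numbers), " numbers: @numbers\n";
#prints: Found 3 numbers: 325 98765 12324
```

There is no greedy versus lazy question here, because each (\d+) simply takes a whole run of consecutive digits.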
This whole business of regex can become VERY complicated. The classic book on this is: Mastering Regular Expressions by Jeffrey Friedl. Fortunately, the vast majority of regexes don't require anywhere near the knowledge required to understand Friedl's book!
In Perl:

    \d   a digit [0-9]
    \D   a non-digit
    \w   a word character [a-zA-Z0-9_]
    \W   a non-word character
    \s   a whitespace char [ \t\f\r\n]
    \S   a non-whitespace char

is normally all you need to know (for plain ASCII text), along with some simple rules about minimal and maximal matches.
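As a quick way to experiment with these classes, here is a small sketch. The sample line is made up for illustration. It pulls out the digits, the "words" and the whitespace characters from one string.

```perl
use strict;
use warnings;

# Made-up sample text: two spaces, a tab, then one more space.
my $line = "temp_1 = 42\t# ok";

my @digits = $line =~ /\d/g;     # each single digit
my @words  = $line =~ /\w+/g;    # each run of word characters
my @spaces = $line =~ /\s/g;     # each single whitespace character

print "digits: @digits\n";
print "words: @words\n";
print "whitespace count: ", scalar(@spaces), "\n";
#prints:
# digits: 1 4 2
# words: temp_1 42 ok
# whitespace count: 4
```

Note that the underscore counts as a word character, which is why "temp_1" comes back as one piece.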
In reply to Re^3: What is the output for this ??
by Marshall
in thread What is the output for this ??
by sreenath