Take a look at the documentation for regex quantifiers and capture groups.
Your code will match numbers whose length is a multiple of 4 digits. For example, something-1234.html will match, as will something-12341234.html. To match only 4 digits, your pattern can be simplified to:
$url=~/(\d{4})\.htm/i;
Note that the + has been removed from your regex. Also, as your code is written, $num will not contain the number. It will contain the whole URL. To get just the number, you need to get the value of the first capture group:
$url=~/(\d{4})\.htm/i;
$num = $1;
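One thing worth guarding against: $1 is only meaningful if the match actually succeeded. Here is a minimal sketch of how that check could look. The helper name extract_num and the sample URLs are my own, made up for illustration, not something from your code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 4 digits found right before ".htm",
# or undef when the URL does not match.
sub extract_num {
    my ($url) = @_;
    # Only read $1 after a successful match; after a failed match it can
    # still hold the value from an earlier successful one.
    if ($url =~ /(\d{4})\.htm/i) {
        return $1;
    }
    return;
}

for my $url ('something-1234.html', 'no-digits.html') {
    my $num = extract_num($url);
    print defined $num ? "$url: $num\n" : "$url: no match\n";
}
```

Wrapping the match in a condition keeps a stale $1 from an earlier match from leaking into $num.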
To allow for 4 or more digits, use the following:
$url=~/(\d{4,})\.htm/i;
To allow for only 4 or 5 digits, use the following:
$url=~/(\d{4,5})\.htm/i;
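To see how the three quantifiers differ, here is a small comparison sketch. The helper first_capture and the sample URLs are assumptions of mine for illustration. Note that none of these patterns is anchored on the left, so with a longer run of digits the {4} and {4,5} forms still match the trailing digits.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first capture group, or undef on no match.
sub first_capture {
    my ($url, $re) = @_;
    my ($num) = $url =~ $re;    # list context yields the captures, or an empty list
    return $num;
}

my @patterns = (
    [ 'exactly 4' => qr/(\d{4})\.htm/i ],
    [ '4 or more' => qr/(\d{4,})\.htm/i ],
    [ '4 or 5'    => qr/(\d{4,5})\.htm/i ],
);

for my $url ('something-123.html', 'something-1234.html', 'something-123456.html') {
    for my $p (@patterns) {
        my ($label, $re) = @$p;
        my $num = first_capture($url, $re);
        printf "%-24s %-10s %s\n", $url, $label, defined $num ? $num : '(no match)';
    }
}
```

For something-123456.html, the three patterns capture 3456, 123456, and 23456 respectively. If longer numbers should be rejected outright, one option is to put a negative lookbehind such as (?<!\d) in front of the group so the digits cannot start in the middle of a longer number.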
UPDATE: I really like the named capture groups feature that comes with Perl versions 5.10 and greater. They can be overkill when you are only dealing with one or two groups, but they can make the code much clearer if you are dealing with multiple capture groups.
#!/usr/bin/env perl
use strict;
use warnings;
use v5.10;

my $url = 'something-12345.html';
$url =~ /(?<num>\d{4,5})\.htm/i;
my $num = $+{num};
print "$num\n";
exit;
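To show where named groups start to pay off, here is a rough sketch with three of them. The group names (name, num, ext), the helper parse_url, and the assumption that the URL looks like word-digits.html are all mine, purely for illustration.

```perl
use strict;
use warnings;
use v5.10;

# Hypothetical helper: splits a URL like 'something-12345.html' into named parts.
# Returns a hash reference, or undef when the URL does not match.
sub parse_url {
    my ($url) = @_;
    if ($url =~ /(?<name>[a-z]+)-(?<num>\d{4,5})\.(?<ext>html?)/i) {
        # Copy %+ right away; it only reflects the most recent successful match.
        my %parts = %+;
        return \%parts;
    }
    return;
}

my $parts = parse_url('something-12345.html');
say "$parts->{name} / $parts->{num} / $parts->{ext}" if $parts;
```

With numbered groups you would have to remember that $2 is the number and $3 is the extension. With names, the pattern can be reordered without touching the code that reads the captures.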
In reply to Re: Grabbing numbers from a URL
by kevbot
in thread Grabbing numbers from a URL
by htmanning