Re: Matching numbers by regex.

The * quantifier is greedy: it matches as many characters as it can, although it can also match none at all. In (\d+).*(\d*\d+) the \d* is redundant, because the following \d+ already matches one or more digits. The .* before it matches as many characters as it can, and that includes every digit of the second number except the last: the engine backtracks only far enough to leave the final \d+ a single digit. One way to fix the problem is:

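use strict;
use warnings;

my $data = "Exlief 4    page : 1 /10";
# Precompiled pattern: current page, any run of non-digits, then the page count.
my $match = qr/pag\w+\s*:\s*(\d+)[^\d]*(\d+)/;

print "Pages : $1 / $2\n" if $data =~ $match;

# Same pattern against a single-digit page count.
$data = "Exlief 4    page : 1 / 5";

print "Pages : $1 / $2\n" if $data =~ $match;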

Prints:

Pages : 1 / 10
Pages : 1 / 5

Note three things: a precompiled regex (qr//) is used so the pattern does not have to be retyped (perhaps differently) for each match, 'match any character' has been replaced by 'match any character except a digit', and the redundant digit match has been removed.
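To see the difference side by side, here is a small sketch that runs a greedy pattern and the fixed one against the same string. The "page\s*:\s*" prefix on the greedy pattern is my assumption for illustration; only the (\d+).*(\d*\d+) part is quoted above.

```perl
use strict;
use warnings;

# Assumed reconstruction of the failing pattern; only (\d+).*(\d*\d+)
# comes from the thread, the "page\s*:\s*" prefix is hypothetical.
my $greedy = qr/page\s*:\s*(\d+).*(\d*\d+)/;

# The fix: allow only non-digits between the two numbers.
my $fixed = qr/page\s*:\s*(\d+)[^\d]*(\d+)/;

my $data = "Exlief 4    page : 1 /10";

# In list context a successful match returns the captured groups.
my @greedy = $data =~ $greedy;
my @fixed  = $data =~ $fixed;

print "greedy: $greedy[0] / $greedy[1]\n";    # greedy: 1 / 0
print "fixed : $fixed[0] / $fixed[1]\n";      # fixed : 1 / 10
```

With the greedy version, .* swallows " /1" and leaves only the trailing "0" for the second capture.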

DWIM is Perl's answer to Gödel

Re^2: Matching numbers by regex. (remember \D) by grinder (Bishop) on Apr 19, 2006 at 10:26 UTC
In the above code (and in the other replies in the thread), `[^\d]` may be written as `\D`, which will be more efficient as well, since it avoids calls to `utf8::IsDigit` internally.
• another intruder with the mooring in the heart of the Perl
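A quick sketch of that substitution, checking that both spellings capture the same values on the two sample strings from the parent post. It only compares results and makes no measurement of speed; the variable names are mine.

```perl
use strict;
use warnings;

# Same pattern written two ways: negated class vs. the \D escape.
my $with_class  = qr/pag\w+\s*:\s*(\d+)[^\d]*(\d+)/;
my $with_escape = qr/pag\w+\s*:\s*(\d+)\D*(\d+)/;

my @pages;
for my $data ("Exlief 4    page : 1 /10", "Exlief 4    page : 1 / 5") {
    my @a = $data =~ $with_class;     # captures via [^\d]
    my @b = $data =~ $with_escape;    # captures via \D

    # Record the result only when both spellings agree.
    push @pages, "$b[0] / $b[1]" if "@a" eq "@b";
}

print "Pages : $_\n" for @pages;
```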
Re^3: Matching numbers by regex. (remember \D) by GrandFather (Saint) on Apr 19, 2006 at 10:34 UTC
Heh, good point! I do tend to forget the uppercase versions of the character class escapes such as `\D \W \S`. Thanks for the reminder.