Recognizing numbers

htmanning has asked for the wisdom of the Perl Monks concerning the following question:


Re: Recognizing numbers
by stevieb (Canon) on Jul 27, 2015 at 23:41 UTC

Yes, but so can YAPE::Regex::Explain.

#!/usr/bin/perl
use warnings;
use strict;

use YAPE::Regex::Explain;

print YAPE::Regex::Explain->new('^[0-9]+$')->explain();
print "\n\n";
print YAPE::Regex::Explain->new('^\S{11,}$')->explain();

__END__
The regular expression:

(?-imsx:^[0-9]+$)

matches as follows:
  
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  [0-9]+                   any character of: '0' to '9' (1 or more
                           times (matching the most amount possible))
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------


The regular expression:

(?-imsx:^\S{11,}$)

matches as follows:
  
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  \S{11,}                  non-whitespace (all but \n, \r, \t, \f,
                           and " ") (at least 11 times (matching the
                           most amount possible))
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

-stevieb


Re^2: Recognizing numbers

by htmanning (Friar) on Jul 27, 2015 at 23:49 UTC

Thank you for this. So am I correct that the first one detects numbers, and the second detects whether there are more than 11 characters without a space? Thanks!


Re^3: Recognizing numbers

by stevieb (Canon) on Jul 27, 2015 at 23:56 UTC

Yes, that is correct, except it isn't more than 11; it's 11 or more. To elaborate, the first one is any digit, one or more times (greedy). The second is any non-whitespace character (letter, number, special char etc.) a minimum of 11 times consecutively with no maximum specified (greedy). Because both patterns are anchored with ^ and $, each has to match the whole string.

Also, [0-9] can be shortened to \d, though note that \d also matches non-ASCII Unicode digits unless the /a modifier is in effect.
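
To make the points above concrete, here is a minimal sketch that runs the two patterns against a few sample strings, including the 10-versus-11 character boundary. The helper names all_digits and long_nospace are made up for illustration and are not from the original post. The last three lines assume Perl 5.14 or later for the /a modifier and show how \d and [0-9] can differ on a Unicode digit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers for illustration; the names are not from the thread.
sub all_digits   { return $_[0] =~ /^[0-9]+$/ ? 1 : 0 }    # whole string is ASCII digits
sub long_nospace { return $_[0] =~ /^\S{11,}$/ ? 1 : 0 }   # 11 or more non-whitespace chars

print all_digits('12345'), "\n";             # only digits
print all_digits('123a5'), "\n";             # contains a letter
print long_nospace('abcdefghij'), "\n";      # 10 characters
print long_nospace('abcdefghijk'), "\n";     # exactly 11 characters
print long_nospace('abcdef ghijkl'), "\n";   # contains a space

# \d versus [0-9] on ARABIC-INDIC DIGIT THREE (U+0663)
my $three = "\x{0663}";
print( ($three =~ /^\d+$/    ? 1 : 0), "\n" );   # Unicode-aware \d
print( ($three =~ /^[0-9]+$/ ? 1 : 0), "\n" );   # ASCII range only
print( ($three =~ /^\d+$/a   ? 1 : 0), "\n" );   # /a restricts \d to ASCII
```

The 10-character string fails and the 11-character string passes, which is the "11 or more" boundary described above.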


Re^4: Recognizing numbers

by afoken (Chancellor) on Jul 28, 2015 at 14:25 UTC

Re^5: Recognizing numbers

by 1nickt (Canon) on Jul 29, 2015 at 02:41 UTC

Re: Recognizing numbers
by Athanasius (Archbishop) on Jul 28, 2015 at 07:29 UTC

Hello htmanning,

I'm trying to recognize numbers in a variable and return an error if a number is detected. For example, there is no reason to have a number in the "name" field.

I think this approach is, potentially, a maintenance nightmare in the making. Coming back to the code in, say, 6 months' time: did you mean to allow semicolons, commas, tab characters? To exclude the Western Arabic (Latin alphabet) digits 1, 2, 3, ..., but allow their Eastern Arabic equivalents ١, ٢, ٣, ...? If it’s at all possible, it will be much better to specify explicitly what a “name” is allowed to contain:

my $name_re = qr/^[- A-Za-z]+$/;

...

unless ($name =~ $name_re)
{
    warn "Bad name '$name'";
    ...
}

Then if, down the track, you find that additional characters are needed (e.g., should underscores be allowed?), they can be added once to $name_re, and the code remains clear and self-documenting.
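
As a rough, self-contained sketch of the whitelist idea, the snippet below wraps $name_re in a helper and tries it on a few inputs. The function name is_valid_name and the sample names are assumptions made up for illustration; only the pattern itself comes from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The whitelist from the snippet above: hyphen, space, and ASCII letters only.
my $name_re = qr/^[- A-Za-z]+$/;

# Hypothetical helper: 1 if the name contains only allowed characters, else 0.
sub is_valid_name {
    my ($name) = @_;
    return defined $name && $name =~ $name_re ? 1 : 0;
}

# Sample inputs, invented for this sketch.
for my $name ('Mary-Jane Smith', 'R2D2', 'Smith; DROP', "O'Brien") {
    printf "%-20s %s\n", "'$name'", is_valid_name($name) ? 'ok' : 'rejected';
}
```

The apostrophe in O'Brien is rejected by this whitelist, which is the kind of case where a character would be added once to $name_re. Note also that $ permits a single trailing newline; \z could be used instead if that matters for the input being checked.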

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re: Recognizing numbers
by locked_user sundialsvc4 (Abbot) on Jul 29, 2015 at 02:12 UTC

I also notice that the regexes are anchored to the start and/or to the end with the ^ and $ characters ... do you intend that? As written, they will match only a string that consists entirely of digits, not one that merely contains digits.
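
A small sketch of the difference, assuming the goal is to reject a name field that has a digit anywhere in it. The helper names only_digits and contains_digit are hypothetical and used here only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper names, for illustration only.
sub only_digits    { return $_[0] =~ /^[0-9]+$/ ? 1 : 0 }   # anchored: the whole string is digits
sub contains_digit { return $_[0] =~ /[0-9]/    ? 1 : 0 }   # unanchored: a digit anywhere

for my $input ('John', 'John3', '12345') {
    printf "%-8s only_digits=%d contains_digit=%d\n",
        $input, only_digits($input), contains_digit($input);
}
```

For an input such as 'John3', the anchored pattern does not match while the unanchored one does, so the unanchored form is the one that detects a stray digit in a name.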