htmanning has asked for the wisdom of the Perl Monks concerning the following question:

Can someone tell me the difference between these two snippets? I'm trying to recognize numbers in a variable and return an error if a number is detected. For example, there is no reason to have a number in the "name" field.
if ($var1 =~ /^[0-9]+$/);
if ($var2 =~ /^\S{11,}$/);

Re: Recognizing numbers
by stevieb (Canon) on Jul 27, 2015 at 23:41 UTC

    Yes, but so can YAPE::Regex::Explain.

    #!/usr/bin/perl
    use warnings;
    use strict;

    use YAPE::Regex::Explain;

    print YAPE::Regex::Explain->new('^[0-9]+$')->explain();
    print "\n\n";
    print YAPE::Regex::Explain->new('^\S{11,}$')->explain();

    __END__

    The regular expression:

    (?-imsx:^[0-9]+$)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      ^                        the beginning of the string
    ----------------------------------------------------------------------
      [0-9]+                   any character of: '0' to '9' (1 or more
                               times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      $                        before an optional \n, and the end of the
                               string
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    The regular expression:

    (?-imsx:^\S{11,}$)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      ^                        the beginning of the string
    ----------------------------------------------------------------------
      \S{11,}                  non-whitespace (all but \n, \r, \t, \f,
                               and " ") (at least 11 times (matching the
                               most amount possible))
    ----------------------------------------------------------------------
      $                        before an optional \n, and the end of the
                               string
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    -stevieb

      Thank you for this. So am I correct that the first one detects numbers, and the second detects whether there are more than 11 characters without a space? Thanks!

        Yes, that is correct, except it isn't more than 11, it's 11 or more. To elaborate, the first one matches any digit, one or more times (greedy). The second matches any non-whitespace character (letter, number, special character, etc.) a minimum of 11 times consecutively, with no maximum specified (greedy).

        Also, [0-9] can be written more compactly as \d, although \d also matches non-ASCII Unicode digits unless the /a modifier is in effect.
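A small sketch for checking the two points above. The helper names and test strings are made up for illustration. It shows that {11,} accepts exactly 11 characters, and that \d is slightly broader than [0-9]: on a character string it also matches non-ASCII decimal digits unless the /a modifier (Perl 5.14 and later) is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper names, used only for this illustration.
sub is_all_ascii_digits { return $_[0] =~ /^[0-9]+$/  ? 1 : 0 }
sub is_all_digits       { return $_[0] =~ /^\d+$/     ? 1 : 0 }
sub is_all_digits_ascii { return $_[0] =~ /^\d+$/a    ? 1 : 0 }
sub is_long_token       { return $_[0] =~ /^\S{11,}$/ ? 1 : 0 }

# {11,} means "11 or more": compare a 10-character and an 11-character string.
printf "%-12s long? %d\n", $_, is_long_token($_) for 'a' x 10, 'a' x 11;

# U+0663 is ARABIC-INDIC DIGIT THREE: a decimal digit, but not an ASCII one.
my $arabic_three = "\x{0663}";
printf "[0-9]: %d  \\d: %d  \\d with /a: %d\n",
    is_all_ascii_digits($arabic_three),
    is_all_digits($arabic_three),
    is_all_digits_ascii($arabic_three);
```

If only ASCII digits should count, [0-9] (or \d with /a) states that intent explicitly.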

Re: Recognizing numbers
by Athanasius (Archbishop) on Jul 28, 2015 at 07:29 UTC

    Hello htmanning,

    I'm trying to recognize numbers in a variable and return an error if a number is detected. For example, there is no reason to have a number in the "name" field.

    I think this approach is, potentially, a maintenance nightmare in the making. Coming back to the code in, say, 6 months' time: Did you mean to allow semicolons, commas, tab characters? To exclude the West Arabic (Latin alphabet) digits 1, 2, 3, ..., but allow their East Arabic equivalents ١, ٢, ٣, ...? If it’s at all possible, it will be much better to specify explicitly what a “name” is allowed to contain:

    my $name_re = qr/^[- A-Za-z]+$/;
    ...
    unless ($name =~ $name_re)
    {
        warn "Bad name '$name'";
        ...
    }

    Then if, down the track, you find that additional characters are needed (e.g., should underscores be allowed?), they can be added once to $name_re, and the code remains clear and self-documenting.
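A minimal, runnable sketch of the allow-list approach described above. The validate_name helper and the sample names are hypothetical. The character class is the one from the post, but \A and \z are swapped in for ^ and $ (an assumption about what is wanted) so that a trailing newline is rejected as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Allowed characters: hyphen, space, ASCII letters.
# Extend this one definition if more characters turn out to be needed.
my $name_re = qr/\A[- A-Za-z]+\z/;

# Hypothetical helper: returns 1 if the name contains only allowed characters.
sub validate_name {
    my ($name) = @_;
    return 0 unless defined $name;
    return $name =~ $name_re ? 1 : 0;
}

for my $name ('Mary-Jane Smith', 'R2D2', "Bob\n", '') {
    (my $shown = $name) =~ s/\n/\\n/g;    # make the newline visible in the output
    printf "%-18s %s\n", "'$shown'", validate_name($name) ? 'ok' : 'rejected';
}
```

With the original ^...$ anchors, "Bob\n" would pass, because $ also matches just before a final newline.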

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Recognizing numbers
by locked_user sundialsvc4 (Abbot) on Jul 29, 2015 at 02:12 UTC

    I also notice that the regexes are anchored to the start and to the end with the ^ and $ characters ... do you intend that? As written, the first one matches a string that consists only of digits, not one that merely contains digits.
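To make the distinction concrete, here is a short sketch comparing the anchored pattern from the question with an unanchored one. The sample strings and the two helper names are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Anchored: the whole string must be digits (the pattern from the question).
sub is_only_digits { return $_[0] =~ /^[0-9]+$/ ? 1 : 0 }

# Unanchored: true if an ASCII digit appears anywhere in the string.
sub contains_digit { return $_[0] =~ /[0-9]/ ? 1 : 0 }

for my $input ('12345', 'John3', 'John') {
    printf "%-6s only digits: %d  contains a digit: %d\n",
        $input, is_only_digits($input), contains_digit($input);
}
```

For the "name" field in the question, the unanchored test is the one that flags an input such as John3.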