GS_ICT has asked for the wisdom of the Perl Monks concerning the following question:

Hi. This is our procedure to check if data is a valid number. It works well. It is not elegant so hopefully someone will be able to come up with an updated version for all Perl users that works just as well.
sub IsNumber {
    my ($string) = @_;
    $string =~s/ //g;
    my $valid = 0;
    my $count = $string =~ tr/\.//;
    if ( $string =~ m/[a-zA-Z\ \[\]]/ ) {
        $valid = 0;
    }
    elsif ( $string =~ /[^\x00-\x7F]/ ) {
        $valid = 0;
    }
    elsif ( $count > 1 ) {
        $valid = 0;
    }
    elsif ( $string =~ m/[#@':;><,.{}[]=!"£$%^&*()]/ ) {
        $valid = 0;
    }
    elsif ( $string =~ m/^[+-]?\d+$/ ) {
        $valid = 1;
    }
    elsif ( $string =~ m/^[+-]?[0-9]+[.]?[0-9]+/ ) {
        $valid = 1;
    }
    return $valid;
}

Replies are listed 'Best First'.
Re: Is Number
by hippo (Archbishop) on Aug 25, 2022 at 08:57 UTC

    See Is A Number and replies for a fairly recent discussion on the same topic.


    🦛

Re: Is Number
by kcott (Archbishop) on Aug 25, 2022 at 13:34 UTC

    G'day GS_ICT,

    "This is our procedure ..."

    -- No it isn't!

    It's perfectly OK to use code that others have posted here: we're all about sharing. It is not OK to claim that it's yours: that's plagiarism.

    As far as I can see, the code you posted is a verbatim copy of what's in "Is A Number". On what basis do you say "It is not elegant"? Is that your considered opinion or just a paraphrase of the original "This is anything but elegant"?

    Read "How do I change/delete my post?" then update your post to include something like "I took this code from ...". See "What shortcuts can I use for linking to other information?" for ways to link to that content.

    For the solution you seek, the first thing I'd do is throw out all of the blacklist tests. As far as I can see, there are only 13 valid characters: 10 ASCII digits, a plus, a minus, and a dot. Unicode has well over 130,000 characters — are you planning to blacklist all (but 13) of those and keep that list up to date as new versions are published?
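
    The whitelist idea can be sketched roughly as follows. The helper name has_only_allowed_chars is hypothetical (it is not from the original post), and it only checks which characters are present, not whether they are arranged as a well-formed number; that second job belongs to the regex discussed further down.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not code from the thread):
# true only if the string is non-empty and built solely from the
# 13 permitted characters: ASCII digits, '+', '-' and '.'.
sub has_only_allowed_chars {
    my ($string) = @_;
    return 0 unless defined $string && length $string;
    return $string =~ /\A[0-9+.-]+\z/ ? 1 : 0;
}

# A few sample inputs; the last one passes the character check even
# though it is clearly not a number, which is why a structural regex
# is still needed afterwards.
for my $candidate ('42', '-1.5', '1,000', '12a', '+-..') {
    printf "%-6s => %d\n", $candidate, has_only_allowed_chars($candidate);
}
```

    Anything outside the character class, including every non-ASCII digit, is rejected without having to be listed.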

    Create a list of representative valid and invalid numbers. Think about edge cases, such as leading, embedded and trailing whitespace.

    Now create a regex that matches all of the listed valid numbers but none of the invalid ones. Write the regex in a way that is readable, comprehensible, and easily maintained (see code example below). I don't believe that this needs to be particularly long or complicated.

    Now write a test script. An example follows. Note that I've intentionally included two tests that fail; this is for demonstration purposes.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use constant {
        NUM => 0,
        EXP => 1,
    };

    use Test::More;

    my @tests = (
        [1, 1],
        [0, 1],
        ['NaN', 0],
        ['Infinity', 0],
        ['-123.', 0],
        ['-123. ', 0],
        ['-123.0', 1],
        ['-123 ', 0],
        ['000', 0],
        ['007', 0],
    );

    plan tests => 0+@tests;

    my $re = qr{(?x:
        ^               # anchor at start
            [+-]?       # optional leading sign
            [0-9]+      # 1 or more ASCII digits
            (?:
                \.      # literal decimal place
                [0-9]+  # 1 or more ASCII digits
            |           # OR
                        # nothing
            )
        $               # anchor end
    )};

    for my $test (@tests) {
        is $test->[NUM] =~ /$re/, !!$test->[EXP],
            "Test that '$test->[NUM]' is "
            . ($test->[EXP] ? 'VALID' : 'INVALID');
    }

    That outputs:

    1..10
    ok 1 - Test that '1' is VALID
    ok 2 - Test that '0' is VALID
    ok 3 - Test that 'NaN' is INVALID
    ok 4 - Test that 'Infinity' is INVALID
    ok 5 - Test that '-123.' is INVALID
    ok 6 - Test that '-123. ' is INVALID
    ok 7 - Test that '-123.0' is VALID
    ok 8 - Test that '-123 ' is INVALID
    not ok 9 - Test that '000' is INVALID
    #   Failed test 'Test that '000' is INVALID'
    #   at ./pm_11146401_is_number.pl line 42.
    #          got: '1'
    #     expected: ''
    not ok 10 - Test that '007' is INVALID
    #   Failed test 'Test that '007' is INVALID'
    #   at ./pm_11146401_is_number.pl line 42.
    #          got: '1'
    #     expected: ''
    # Looks like you failed 2 tests of 10.

    If you believe multiple leading zeros are valid, change the test data (e.g. ['007', 0] to ['007', 1]). Otherwise, change the regex such that those test data do not match.
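
    For the second option, one possible shape for the regex is sketched below. It is only an illustration under stated assumptions: the integer part must be a lone 0 or start with 1-9, the names $re_no_leading_zeros and is_number_no_leading_zeros are made up here, and the anchors are \A and \z rather than ^ and $, so a trailing newline is rejected as well.

```perl
use strict;
use warnings;

# Assumption: the integer part is either a single 0 or begins with 1-9,
# so '000' and '007' are rejected while '0' and '0.5' are accepted.
my $re_no_leading_zeros = qr{(?x:
    \A                          # anchor at start of string
        [+-]?                   # optional leading sign
        (?: 0 | [1-9][0-9]* )   # a lone 0, or digits not starting with 0
        (?: \. [0-9]+ )?        # optional fraction: dot plus 1 or more digits
    \z                          # anchor at very end of string
)};

# Hypothetical wrapper returning 1 or 0.
sub is_number_no_leading_zeros {
    my ($string) = @_;
    return 0 unless defined $string;
    return $string =~ $re_no_leading_zeros ? 1 : 0;
}

for my $candidate (1, 0, '-123.0', '-123.', '000', '007') {
    printf "%-8s => %s\n", "'$candidate'",
        is_number_no_leading_zeros($candidate) ? 'VALID' : 'INVALID';
}
```

    Whether inputs such as '-0' or '+0.0' should count as valid is a policy decision; add them to the test data with whichever expectation suits your use.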

    If you encounter problems with any part of this coding exercise: read "How do I post a question effectively?"; post your code, data and output; clearly explain the difficulties you're experiencing. We will endeavour to help you.

    — Ken

Re: Is Number
by syphilis (Archbishop) on Aug 25, 2022 at 12:34 UTC
    hopefully someone will be able to come up with an updated version for all Perl users that works just as well

    The thing is that perl users might not all agree on what IsNumber() should return for all inputs.
    It can often depend upon what the intended usage is.

    A good starting point is probably to use Scalar::Util::looks_like_number() and then tweak the results to allow for any cases where that function's return does not suit your needs.
    For example:
    >perl -MScalar::Util="looks_like_number" -we "print 'WTF' if looks_like_number('0xa.8');"
    >
    That's probably what you want since perl is only going to numify that string to 0. But if you're going to pass the string '0xa.8' to a function that handles such an input as intended, then you'd want your IsNumber() subroutine to return TRUE.
    >perl -MMath::MPFR -wle "print Math::MPFR->new('0xa.8');"
    1.05e1
    >
    Other points of contention might be whether your IsNumber() function should return true for numeric objects like Math::BigInt, Math::BigFloat and Math::BigRat objects ... or maybe you don't even have to consider such inputs.
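
    As a rough sketch of the "start with looks_like_number() and tweak" approach: the policy below (no references or objects, no surrounding whitespace, no NaN/Infinity spellings) is purely an assumption chosen for illustration, and is_plain_number is a hypothetical name. Your own tweaks will depend on the intended usage.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical policy wrapper: defer to looks_like_number(), but first
# rule out the inputs this particular (assumed) policy does not want.
sub is_plain_number {
    my ($value) = @_;
    return 0 unless defined $value;
    return 0 if ref $value;              # references and objects excluded here
    return 0 if $value =~ /\A\s|\s\z/;   # no leading or trailing whitespace
    return 0 if $value =~ /nan|inf/i;    # no "NaN", "Inf" or "Infinity"
    return looks_like_number($value) ? 1 : 0;
}

for my $candidate ('42', '-1.5e3', 'NaN', ' 42', '0xa.8') {
    printf "%-8s => %d\n", "'$candidate'", is_plain_number($candidate);
}
```

    To accept Math::BigInt and similar objects, or strings such as '0xa.8', the corresponding early return would be replaced with a check suited to whatever consumes the value.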

    Cheers,
    Rob
Re: Is Number
by harangzsolt33 (Deacon) on Aug 26, 2022 at 00:50 UTC
    I use the following code. It relies on Perl itself to decide if a string is a number or not. If Perl thinks it's a number, then it returns 1, otherwise zero. Works for big ints too (returns 1 for a big int).

    sub isNumber {
        defined $_[0] or return 0;
        my $R = 1;
        local $SIG{__WARN__} = sub { $R = 0; };
        return ($_[0] < 0) ? $R : $R;
    }
      sub isNumber {
          defined $_[0] or return 0;
          my $R = 1;
          local $SIG{__WARN__} = sub { $R = 0; };
          return ($_[0] < 0) ? $R : $R;
      }

      Interesting. I notice that it relies on warnings being enabled ... though that's hardly a criticism of it.
      However, it doesn't always agree with looks_like_number():

      use strict;
      use warnings;

      use Test::More;
      use Scalar::Util qw(looks_like_number);

      my $x = 42;
      my $y = 'not a number';

      cmp_ok(looks_like_number(\$x), '==', isNumber(\$x), 'agrees re ref to number');
      cmp_ok(looks_like_number(\$y), '==', isNumber(\$y), 'agrees re ref to non-number');

      done_testing();

      sub isNumber {
          defined $_[0] or return 0;
          my $R = 1;
          local $SIG{__WARN__} = sub { $R = 0; };
          return ($_[0] < 0) ? $R : $R;
      }

      __END__
      Outputs:
      not ok 1 - agrees re ref to number
      #   Failed test 'agrees re ref to number'
      #   at IsNumber.pl line 10.
      #          got:
      #     expected: 1
      not ok 2 - agrees re ref to non-number
      #   Failed test 'agrees re ref to non-number'
      #   at IsNumber.pl line 11.
      #          got:
      #     expected: 1
      1..2
      # Looks like you failed 2 tests of 2.
      Which one is correct?
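
      A small sketch of why the two disagree on references: a reference used in numeric context yields its address and perl raises no "isn't numeric" warning, so a check that counts warnings has nothing to react to. The helper numeric_warnings_for is hypothetical and exists only to make that visible; it does not settle which answer is the right one for a given use.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: how many warnings does perl raise when the
# value is used in numeric context?
sub numeric_warnings_for {
    my ($value) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $unused = $value + 0;    # force numeric context
    return scalar @warnings;
}

my $x   = 42;
my $ref = \$x;

# The reference numifies silently, so a warning-based check treats it
# as a number, while looks_like_number() reports false for it.
printf "warnings for a reference:     %d\n", numeric_warnings_for($ref);
printf "warnings for 'not a number':  %d\n", numeric_warnings_for('not a number');
printf "looks_like_number(reference): %s\n",
    looks_like_number($ref) ? 'true' : 'false';
```

      If references should never count as numbers, an early "return 0 if ref $_[0];" in isNumber() would be one way to bring the two into agreement for this case.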

      Cheers,
      Rob
        oh wait... you did. "it looks like looks_like_number() is better."