in reply to Is Number

G'day GS_ICT,

"This is our procedure ..."

-- No it isn't!

It's perfectly OK to use code that others have posted here: we're all about sharing. It is not OK to claim that it's yours: that's plagiarism.

As far as I can see, the code you posted is a verbatim copy of what's in "Is A Number". On what basis do you say "It is not elegant"? Is that your considered opinion or just a paraphrase of the original "This is anything but elegant"?

Read "How do I change/delete my post?" then update your post to include something like "I took this code from ...". See "What shortcuts can I use for linking to other information?" for ways to link to that content.

For the solution you seek, the first thing I'd do is throw out all of the blacklist tests. As far as I can see, there are only 13 valid characters: 10 ASCII digits, a plus, a minus, and a dot. Unicode has well over 130,000 characters — are you planning to blacklist all (but 13) of those and keep that list up to date as new versions are published?
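To make the whitelist idea concrete, here's a minimal sketch. The sub name has_only_valid_chars is made up for illustration, and I'm assuming the 13 characters listed above are the complete set you want to allow. It only checks which characters are present, not their order, so treat it as a first filter at most.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper (not from the original code): true if the string
# is non-empty and holds only the 13 whitelisted characters.
sub has_only_valid_chars {
    my ($str) = @_;
    return $str =~ /\A[0-9+.-]+\z/ ? 1 : 0;
}

# [label, string] pairs; the label keeps non-ASCII data out of the output.
my @candidates = (
    ['-123.0',                          '-123.0'],
    ['12a',                             '12a'],
    ['U+0663 ARABIC-INDIC DIGIT THREE', "\x{0663}"],
    ['+-.',                             '+-.'],
);

for my $candidate (@candidates) {
    my ($label, $str) = @$candidate;
    printf "%-32s %s\n", $label,
        has_only_valid_chars($str) ? 'passes' : 'rejected';
}
```

Note that '+-.' passes this check: a character whitelist says nothing about structure, which is why a regex describing the whole number is still needed. I used [0-9] rather than \d because \d can also match non-ASCII digits.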

Create a list of representative valid and invalid numbers. Think about edge cases, such as leading, embedded and trailing whitespace.

Now create a regex that matches all of the listed valid numbers but none of the invalid ones. Write the regex in a way that is readable, comprehensible, and easily maintained (see code example below). I don't believe that this needs to be particularly long or complicated.

Now write a test script. An example follows. Note that I've intentionally included two tests that fail; this is for demonstration purposes.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use constant {
        NUM => 0,
        EXP => 1,
    };

    use Test::More;

    my @tests = (
        [1, 1],
        [0, 1],
        ['NaN', 0],
        ['Infinity', 0],
        ['-123.', 0],
        ['-123. ', 0],
        ['-123.0', 1],
        ['-123 ', 0],
        ['000', 0],
        ['007', 0],
    );

    plan tests => 0+@tests;

    my $re = qr{(?x:
        ^               # anchor at start
            [+-]?       # optional leading sign
            [0-9]+      # 1 or more ASCII digits
            (?:
                \.      # literal decimal place
                [0-9]+  # 1 or more ASCII digits
            |           # OR
                        # nothing
            )
        $               # anchor end
    )};

    for my $test (@tests) {
        is $test->[NUM] =~ /$re/, !!$test->[EXP],
            "Test that '$test->[NUM]' is "
            . ($test->[EXP] ? 'VALID' : 'INVALID');
    }

That outputs:

    1..10
    ok 1 - Test that '1' is VALID
    ok 2 - Test that '0' is VALID
    ok 3 - Test that 'NaN' is INVALID
    ok 4 - Test that 'Infinity' is INVALID
    ok 5 - Test that '-123.' is INVALID
    ok 6 - Test that '-123. ' is INVALID
    ok 7 - Test that '-123.0' is VALID
    ok 8 - Test that '-123 ' is INVALID
    not ok 9 - Test that '000' is INVALID
    #   Failed test 'Test that '000' is INVALID'
    #   at ./pm_11146401_is_number.pl line 42.
    #          got: '1'
    #     expected: ''
    not ok 10 - Test that '007' is INVALID
    #   Failed test 'Test that '007' is INVALID'
    #   at ./pm_11146401_is_number.pl line 42.
    #          got: '1'
    #     expected: ''
    # Looks like you failed 2 tests of 10.

If you believe multiple leading zeros are valid, change the test data (e.g. ['007', 0] to ['007', 1]). Otherwise, change the regex such that those test data do not match.
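If you take the second route, here's one possible shape for the changed regex. This is a sketch, not the only answer. I'm assuming a lone '0' stays valid while any longer integer part must start with a non-zero digit. The name $re_no_leading_zeros and the extra test data are mine, for illustration only.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Test::More;

# Assumption: a lone '0' is valid, but any other integer part
# must not start with '0'.
my $re_no_leading_zeros = qr{(?x:
    ^                       # anchor at start
        [+-]?               # optional leading sign
        (?:
            0               # a single zero
        |                   # OR
            [1-9][0-9]*     # a non-zero digit, then 0 or more ASCII digits
        )
        (?:
            \.              # literal decimal place
            [0-9]+          # 1 or more ASCII digits
        )?                  # the fractional part is optional
    $                       # anchor end
)};

my @tests = (
    [0, 1],
    ['0.5', 1],
    ['10', 1],
    ['-123.0', 1],
    ['000', 0],
    ['007', 0],
    ['-007', 0],
);

plan tests => 0+@tests;

for my $test (@tests) {
    my ($num, $exp) = @$test;
    is $num =~ /$re_no_leading_zeros/, !!$exp,
        "Test that '$num' is " . ($exp ? 'VALID' : 'INVALID');
}
```

Whether something like '-0' or '0.50' should count as valid is your call. Add those cases to the test data first, then adjust the regex until the tests agree with your decision.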

If you encounter problems with any part of this coding exercise: read "How do I post a question effectively?"; post your code, data and output; clearly explain the difficulties you're experiencing. We will endeavour to help you.

— Ken