String validation

Speedfreak has asked for the wisdom of the Perl Monks concerning the following question:

I have a minor problem which I'm guessing could be done with a regex. I am being sent a parameter from a CGI form that I need to check is valid.

What I need to check is that:

The string is exactly 4 characters long, not one more, not one less. It must only contain numbers and letters. Also, there is a high chance the string will arrive in lowercase and I need to convert it to uppercase before my script can use it.

I need to redirect the user to an error page if the string was invalid rather than continue running the script.

Can anyone help me with this one?

- Jed

Re: String validation by chromatic (Archbishop) on Mar 24, 2000 at 21:19 UTC
One option:

`if ($string =~ m!(^[a-z0-9]{4})$!) {
    $string =~ tr/a-z/A-Z/;
    # do something
} else {
    # do some error
}`

This looks for the start of the string, then exactly four characters from the class of lowercase a through z or 0 through 9 inclusive (capturing them), and then the end of the string. If the match succeeds, it uppercases the string. (I use tr/// because it's faster than a substitution, and because leading digits may cause a warning about modifying a constant. uc is another option.)
Re: String validation by btrott (Parson) on Mar 25, 2000 at 02:22 UTC
Shouldn't that regex also match upper-case letters? Your regex will say that the string "A32g" is bad, but according to the problem description, the original poster wanted to match any letters, including (presumably) upper-case, so that string should match. I would think you could just change the code to:

`if ($string =~ m!(^[A-Za-z0-9]{4})$!) {
    $string =~ tr/a-z/A-Z/;
    # or $string = uc $string;
    # do something
} else {
    # do some error
}`

I did a bit of benchmarking of uc vs. tr, and the results were pretty similar:

`Benchmark: timing 1000000 iterations of tr, uc...
        tr: 10 secs ( 7.49 usr 0.00 sys = 7.49 cpu)
        uc:  9 secs ( 6.60 usr 0.00 sys = 6.60 cpu)`

So they should be pretty much interchangeable in terms of performance.
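For anyone who wants to repeat the comparison, a minimal sketch using the core Benchmark module might look like the following. The sample value `a3x9`, the helper names `upper_tr` and `upper_uc`, and the iteration count are arbitrary choices for illustration, not the setup behind the numbers above, so the timings will differ.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical sample input; any four-character alphanumeric string will do.
my $sample = 'a3x9';

# Upper-case a copy with tr///, leaving the caller's value untouched.
sub upper_tr {
    my ($s) = @_;
    $s =~ tr/a-z/A-Z/;
    return $s;
}

# Upper-case a copy with the built-in uc.
sub upper_uc {
    my ($s) = @_;
    return uc $s;
}

# Run each variant the same number of times and print the timings.
timethese(100_000, {
    'tr' => sub { upper_tr($sample) },
    'uc' => sub { upper_uc($sample) },
});
```

Both helpers return the same string for ASCII input, so the choice between them is mostly a matter of taste.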
You're right... by chromatic (Archbishop) on Mar 25, 2000 at 02:25 UTC
Let's make it a little more readable, then:

`if ($string =~ m!^(\w{4})$!) {
    $string =~ tr/a-z/A-Z/;
    # or $string = uc $string;
    # do something
} else {
    # do some error
}`

Since the OP wants four alphanumerics (and they are merely likely to arrive in lowercase), we'll use \w. The reason I suggested using tr/// instead of uc is that, in my testing, uc() failed on a scalar starting with a digit. Now it seems to work correctly. Let's go back to:

`if ($string =~ s!^(\w{4})$!uc($1)!e) {
    # do something
} else {
    # do some error
    # print "Location: $error_url"
}`
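Putting the pieces together, a rough sketch of the whole flow (validate, uppercase, redirect on failure) could look like this. The names `normalize_code` and `redirect_response`, the sample input, and the error URL are all made up for the example. It also uses an explicit ASCII class and `\z` rather than `\w` and `$`, which is a stricter choice than the code above: it rejects underscores and a trailing newline.

```perl
use strict;
use warnings;

# Return the upper-cased code if $input is exactly four ASCII letters or
# digits; otherwise return nothing (undef in scalar context).
# \z is used instead of $ so that a trailing newline is rejected as well.
sub normalize_code {
    my ($input) = @_;
    return unless defined $input;
    return unless $input =~ /\A[A-Za-z0-9]{4}\z/;
    return uc $input;
}

# Build a minimal CGI redirect response for the given URL.
sub redirect_response {
    my ($url) = @_;
    return "Location: $url\n\n";
}

# Hypothetical values: in a real script $raw would come from the form
# parameter and $error_url would point at the real error page.
my $raw       = 'a32g';
my $error_url = 'http://www.example.com/error.html';

my $code = normalize_code($raw);
if (defined $code) {
    print "Valid code: $code\n";    # carry on with the rest of the script
} else {
    print redirect_response($error_url);
    exit;
}
```

Keeping the check in its own sub means the rest of the script only ever sees the normalized value or nothing at all.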
RE: You're right... by btrott (Parson) on Mar 25, 2000 at 02:36 UTC
Do you mean \w? \d just matches digits. I thought of using \w, but \w is alphanumerics plus '_', and I didn't know if the OP wanted '_'.
Re: String validation by turnstep (Parson) on Mar 30, 2000 at 03:50 UTC
If they do NOT want the underscore, you should technically still try to use \w anyway, because of the possible use of 'use locale':

`if ($string !~ /_/ && $string =~ s!^(\w{4})$!uc($1)!e) {`
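To make the underscore question concrete, here is a small sketch comparing three patterns. The class `[^\W_]` ("not a non-word character and not an underscore") is a common idiom that I am suggesting as an alternative; it is not from the thread. It matches whatever \w matches, minus '_', so it needs no separate `!~ /_/` test. The helper names are invented for the example.

```perl
use strict;
use warnings;

# \w accepts letters, digits and underscore.
sub matches_word     { return $_[0] =~ /^\w{4}$/ ? 1 : 0 }

# Explicit ASCII class: no underscore, but no locale-specific letters either.
sub matches_ascii    { return $_[0] =~ /^[A-Za-z0-9]{4}$/ ? 1 : 0 }

# Word characters other than underscore.
sub matches_no_under { return $_[0] =~ /^[^\W_]{4}$/ ? 1 : 0 }

# Print one row per candidate showing which patterns accept it.
for my $candidate ('a32g', 'a_2g', 'a-2g') {
    printf "%-5s  \\w:%d  ascii:%d  [^\\W_]:%d\n",
        $candidate,
        matches_word($candidate),
        matches_ascii($candidate),
        matches_no_under($candidate);
}
```

Only the first pattern accepts `a_2g`; all three accept `a32g` and reject `a-2g`.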