jlongino has asked for the wisdom of the Perl Monks concerning the following question:

I'm a novice when it comes to regexen, so I'm hoping someone can tell me if I've left anything out, TIMTOWTDI, etc. As you can see, I've tested the few things I could think of, but I'm afraid I might have missed something. This will replace some hideously ugly code I found that used for (;;), substr and index. The string must consist of exactly two uppercase alphabetic characters, no more, no less. This is a small snippet from a program that someone else has asked me to look at. Thanks!
    use strict;

    my @strs = ('ABC', 'aZ', 'CC', 'AZ', 'ZZ', '?A', 'W1', 'A ', ' ', '..', 'A', '1', '');

    foreach (@strs) {
        if (/^[A-Z]{2}$/) {
            print "PASS: ";
        }
        else {
            print "## FAIL: ";
        }
        print "'$_'\n";
    }

--Jim

Re: Simple Regex for Alpha string
by Masem (Monsignor) on Nov 14, 2001 at 00:55 UTC
    /^[A-Z]{2}$/ is about as concise and explicit as you can get for your purposes; probably the only thing that might cause problems is characters outside 7-bit ASCII (accented letters, or Unicode encodings), but for the straightforward 7-bit character set, you've got it nailed.

      Thanks Masem. I don't expect to encounter high-order ASCII, but then you never know. How would one handle that situation? If this requires too lengthy a reply, feel free to direct me to an appropriate doc or whatever. Maybe Japhy references something like this in his soon-to-be book?

      --Jim

        To handle non-ASCII characters, you can put use locale; in effect (see perllocale) and use the POSIX character class "upper": /^[[:upper:]]{2}$/
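
        For comparison, here is a rough sketch of how [A-Z] and [[:upper:]] differ on accented letters. Because use locale depends on which locales are installed on a given system, this sketch swaps in the /u modifier (Unicode rules, which assumes Perl 5.14 or later) instead of a locale. The helper names is_two_upper_ascii and is_two_upper_unicode are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper names, used only for this illustration.

# ASCII-only check, same pattern as in the original question.
sub is_two_upper_ascii {
    my ($s) = @_;
    return $s =~ /^[A-Z]{2}$/ ? 1 : 0;
}

# Unicode-aware check: /u requests Unicode rules (assumes Perl 5.14+),
# so [[:upper:]] also accepts uppercase letters outside A-Z.
sub is_two_upper_unicode {
    my ($s) = @_;
    return $s =~ /^[[:upper:]]{2}$/u ? 1 : 0;
}

# E-acute and A-grave, written as code points to avoid any
# dependence on the encoding of this source file.
my %samples = (
    'AZ'            => 'AZ',
    'az'            => 'az',
    'E-acute A-grave' => "\x{C9}\x{C0}",
);

for my $label (sort keys %samples) {
    my $s = $samples{$label};
    printf "%-16s ascii=%d unicode=%d\n",
        $label, is_two_upper_ascii($s), is_two_upper_unicode($s);
}
```

        Whether accented uppercase letters should pass at all depends on what the two-letter string represents, so the ASCII-only pattern may well be the right choice here.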
(tye)Re: Simple Regex for Alpha string
by tye (Sage) on Nov 14, 2001 at 01:40 UTC

    Note that "AB\n" will pass your test. This is often a nice convenience, but you can use /^[A-Z]{2}\z/ to disallow trailing newlines.
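
    A small sketch of the difference, with made-up helper names (matches_dollar and matches_z): $ matches at the very end of the string or just before a single trailing newline, while \z matches only at the very end.

```perl
use strict;
use warnings;

# Hypothetical helper names, used only for this illustration.

# Anchored with $: tolerates one trailing newline.
sub matches_dollar {
    my ($s) = @_;
    return $s =~ /^[A-Z]{2}$/ ? 1 : 0;
}

# Anchored with \z: nothing may follow the two letters.
sub matches_z {
    my ($s) = @_;
    return $s =~ /^[A-Z]{2}\z/ ? 1 : 0;
}

for my $s ("AB", "AB\n", "AB\n\n") {
    # Make newlines visible in the output.
    (my $shown = $s) =~ s/\n/\\n/g;
    printf "%-10s \$ => %d   \\z => %d\n",
        "'$shown'", matches_dollar($s), matches_z($s);
}
```

    Only "AB\n" is treated differently by the two patterns; "AB" passes both and "AB\n\n" fails both.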

            - tye (but my friends call me "Tye")
      That would not be good in this case, but fortunately I think the string that is tested is chomped earlier in the program. But this is something really good to know and could probably drive a regex novice insane during the debugging process. Thanks!

      --Jim