kiat has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I'm wondering if it's possible to have a single regex that accepts a string if it's one of the following:

a) letters only
b) letters plus underscore
c) letters with numbers
d) letters plus underscore plus numbers

Strings that contain purely numbers or numbers with underscores are unacceptable.

Thanks in anticipation :)

Update: Thanks! Funny, I kept thinking it must be done with two regexes. Btw, from Re: A single regex, I'm wondering if it would still be doable with a single regex with one more constraint added:

Maximally one underscore.

Update: Wow, regex is really quite an art and a science in itself. Thanks to all for helping :)

Replies are listed 'Best First'.
Re: A single regex
by ccn (Vicar) on Sep 10, 2004 at 14:29 UTC

    /^\w*[A-Za-z]\w*$/

    It means at least one letter and any optional numbers or underscores
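A quick way to check this pattern against the cases in the question is a small test loop. This is only a sketch: the sub name `is_acceptable` and the sample strings are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string consists only of word characters
# (letters, digits, underscore) and contains at least one ASCII letter.
sub is_acceptable {
    my ($s) = @_;
    return $s =~ /^\w*[A-Za-z]\w*$/ ? 1 : 0;
}

# Sample inputs chosen for illustration.
for my $s (qw(abc a_b abc123 a_1 123 12_3 _)) {
    printf "%-7s %s\n", $s, is_acceptable($s) ? 'ok' : 'rejected';
}
```

One thing to keep in mind: `$` also matches just before a trailing newline, so a string such as "abc\n" would pass. Using `\z` instead of `$` is stricter if that matters.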

Re: A single regex
by BrowserUk (Patriarch) on Sep 10, 2004 at 14:34 UTC

    Add some more testcases to this:

    m[^(?=.*[a-zA-Z])[\w]+$] and print "$_ : ok" for qw[ 123 123_ 123_a a_ a_123 abc _ a123];

    123_a : ok
    a_ : ok
    a_123 : ok
    abc : ok
    a123 : ok

    Tweaked: The \d was redundant.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re: A single regex
by Roy Johnson (Monsignor) on Sep 10, 2004 at 15:27 UTC
    With the maximum-of-one-underscore constraint:
    /^[a-z0-9]*(?:[a-z][a-z0-9]*_?|_[a-z0-9]*[a-z])[a-z0-9]*$/i
    It's either going to have a letter before an (optional) underscore, or it's going to have an underscore before a letter. Surrounding and intervening characters can be letters or numbers.

    A lookahead solution would check that there's no more than one optional underscore, and that there's at least one letter:

    /^(?=[^_]*_?[^_]*$)\w*[a-z]\w*$/i;
    In both cases, you're kind of doing multiple regexes.
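To see whether the explicit-alternation pattern and the lookahead pattern accept the same strings, one option is to run both over every short string built from a tiny alphabet. The sketch below assumes a three-character alphabet (`a`, `1`, `_`) and a maximum length of 5; the helper names `all_strings` and `disagreements` are hypothetical.

```perl
use strict;
use warnings;

# The two patterns from the reply above, compiled once.
my $explicit  = qr/^[a-z0-9]*(?:[a-z][a-z0-9]*_?|_[a-z0-9]*[a-z])[a-z0-9]*$/i;
my $lookahead = qr/^(?=[^_]*_?[^_]*$)\w*[a-z]\w*$/i;

# Build every string of length 1..$max over the given alphabet.
sub all_strings {
    my ($max, @alphabet) = @_;
    my @level = ('');
    my @out;
    for my $len (1 .. $max) {
        my @next;
        for my $prefix (@level) {
            push @next, map { $prefix . $_ } @alphabet;
        }
        @level = @next;
        push @out, @level;
    }
    return @out;
}

# Return the strings on which the two patterns give different answers.
sub disagreements {
    my @strings = @_;
    return grep {
        ($_ =~ $explicit ? 1 : 0) != ($_ =~ $lookahead ? 1 : 0)
    } @strings;
}

my @bad = disagreements(all_strings(5, qw(a 1 _)));
print @bad ? "patterns differ on: @bad\n" : "patterns agree on all samples\n";
```

This only compares the patterns on that small alphabet. `\w` in the lookahead version can match word characters beyond `[a-z0-9_]`, so the two are not guaranteed to agree on arbitrary input.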

    Caution: Contents may have been coded under pressure.
Re: A single regex
by Sidhekin (Priest) on Sep 10, 2004 at 15:49 UTC

    "Maximally one underscore", like "maximally n anything", is most easily done by adding a negative lookahead after the start anchor:

    /^(?!.*_.*_)\w*[A-Za-z]\w*$/

    (That is also the most readable way, IMO, and not too slow, though if execution speed is an issue, doing it as a single regex is probably wrong anyway.)
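The "maximally n anything" remark can be sketched by building the pattern from a limit: the negative lookahead rejects any string in which n+1 underscores can be found. The sub name `max_underscores_re` is an assumption made for this example, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a compiled pattern that accepts word-character
# strings with at least one ASCII letter and at most $n underscores.
sub max_underscores_re {
    my ($n) = @_;
    my $too_many = $n + 1;   # the lookahead fails the match at this count
    return qr/^(?!(?:.*_){$too_many})\w*[A-Za-z]\w*$/;
}

my $at_most_one = max_underscores_re(1);
for my $s (qw(abc a_b a_b_c 1_2 _a)) {
    printf "%-6s %s\n", $s, $s =~ $at_most_one ? 'ok' : 'rejected';
}
```

With a limit of 1 the lookahead is equivalent to the `(?!.*_.*_)` shown above, just written with a counted group.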

    print "Just another Perl ${\(trickster and hacker)},"
    The Sidhekin proves Sidhe did it!

Re: A single regex
by BrowserUk (Patriarch) on Sep 10, 2004 at 15:22 UTC
    I'm wondering if it would still be doable with a single regex with one more constraint added: Maximally one underscore.

    Yes.

    m[^(?!.*_.*_)(?=.*[a-zA-Z])[\w]+$]

    Corrected. Thanks Roy Johnson


Re: A single regex
by graff (Chancellor) on Sep 10, 2004 at 17:14 UTC
    With the limit of zero or one underscore, it becomes a case where I'd prefer two separate conditions -- it just seems easier (and is probably faster):
    ( tr/_/_/ <= 1 and /^\w*[A-Za-z]\w*$/ )
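As a sketch of the two-condition approach applied to a named variable rather than `$_`: `tr/_//` with an empty replacement list counts the underscores without changing the string. The sub name `valid_name` and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: at most one underscore, only word characters,
# and at least one ASCII letter.
sub valid_name {
    my ($s) = @_;
    my $underscores = ($s =~ tr/_//);   # counts, does not modify $s
    return ($underscores <= 1 && $s =~ /^\w*[A-Za-z]\w*$/) ? 1 : 0;
}

for my $s (qw(abc a_1 a_b_c 123 1_)) {
    printf "%-6s %s\n", $s, valid_name($s) ? 'ok' : 'rejected';
}
```

Keeping the count separate from the pattern also makes the underscore limit easy to change later.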
Re: A single regex
by Anonymous Monk on Sep 10, 2004 at 14:42 UTC
    /^[[:word:]]*[[:alpha:]][[:word:]]*$/
    This accepts words with two underscores; it isn't clear from your description whether more than one underscore is allowed. Basically, your requirements reduce to: letters, numbers, underscores, with at least one letter. That means you have zero or more letters, numbers, and underscores, then a letter, then again zero or more letters, numbers, and underscores.
Re: A single regex
by Crian (Curate) on Sep 10, 2004 at 14:31 UTC
    someone else was faster