hiddenlinux has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I am hopeless with all this regex stuff, and was hoping that someone might be able to fix my regexes, as I think they are a bit broken.

fullname should be 3-30 characters long and contain only upper/lower case a-z, the symbols - and ', and spaces
username should be 3-16 characters long, start with a lower case a-z, and otherwise contain lower/upper case letters, numbers and the symbols - _ .
mothers should be the same as fullname
email should roughly validate e-mail addresses (yes, I know about the modules, but can't install them, and I know that _some_ valid addresses will get rejected using regexes)
conditions should only = Yes
my %spec = (
    fullname   => qr/^ [a-zA-Z\s'-] {3,30} \z/x,
    username   => qr/^ [a-z] [-.\w]{2,15} \z/x,
    mothers    => qr/^ [a-zA-Z\s'-] {3,30} \z/x,
    email      => qr/^ [\w\.-]+ \@ (?:[a-z\d-]+\.)+ [a-z\d]+ \z/ix,
    conditions => qr/^Yes\z/,
);
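
For anyone wanting to try the hash out, here is a minimal sketch of how a spec like this might be applied to submitted data. The %spec entries are copied from the question; the validate() helper and the %form sample values are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# %spec is copied from the question above.
my %spec = (
    fullname   => qr/^ [a-zA-Z\s'-] {3,30} \z/x,
    username   => qr/^ [a-z] [-.\w]{2,15} \z/x,
    mothers    => qr/^ [a-zA-Z\s'-] {3,30} \z/x,
    email      => qr/^ [\w\.-]+ \@ (?:[a-z\d-]+\.)+ [a-z\d]+ \z/ix,
    conditions => qr/^Yes\z/,
);

# Hypothetical helper: returns the names of fields (in sorted order)
# that are missing or do not match their pattern.
sub validate {
    my ($form) = @_;
    my @bad = grep {
        !defined $form->{$_} || $form->{$_} !~ $spec{$_}
    } sort keys %spec;
    return @bad;
}

# Hypothetical sample input.
my %form = (
    fullname   => "Mary O'Neil-Smith",
    username   => 'm.oneil_99',
    mothers    => 'Jane Doe',
    email      => 'mary@example.com',
    conditions => 'Yes',
);

my @bad = validate(\%form);
print @bad ? "Rejected fields: @bad\n" : "All fields OK\n";
```

Keeping the patterns in one hash means the checking loop never has to change when a field is added; whether the patterns themselves are strict enough is a separate question.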

Replies are listed 'Best First'.
Re: Regex Hell
by sauoq (Abbot) on Jul 08, 2003 at 20:30 UTC

    With the exception of the email regex, those should do what you want.

    You're on your own with email if you don't have access to one of the available modules that do that already. If you can't install one, you might consider asking the system administrator to install one for you. Barring that, copying and pasting might be a better option than trying to roll your own. Checking email addresses for validity is not trivial and I, for one, won't try to interpret what you mean by "roughly validate."
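
To make the "not trivial" point concrete, here is a small sketch that runs the email pattern from the question against a few addresses. The pattern is copied as posted; the sample addresses are illustrative assumptions, not a test suite.

```perl
use strict;
use warnings;

# Pattern copied from the question.
my $email = qr/^ [\w\.-]+ \@ (?:[a-z\d-]+\.)+ [a-z\d]+ \z/ix;

# Illustrative samples (assumptions, not taken from the thread).
my @samples = (
    'mary@example.com',          # ordinary address
    'mary+lists@example.com',    # "+" is legal in the local part
    '..@-.-.a',                  # nothing anyone could deliver mail to
);

for my $addr (@samples) {
    printf "%-24s %s\n", $addr, ($addr =~ $email ? 'accepted' : 'rejected');
}
```

The second sample shows a legitimate address that the character class for the local part cannot match, and the third shows junk that the pattern lets through, so the regex errs in both directions.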

    -sauoq
    "My two cents aren't worth a dime.";
    
      Hi there, does \s allow tabs, carriage returns and other things like that? I don't want any smart-arse putting in any newlines that would mess up my other script.
        does \s allow tabs, carriage returns and other things like that?

        Yes. It is the same as [\f\n\r\t ] (and it actually allows a few other characters if you are using Unicode.) You can use a literal space or an octal (\040) or hex (\x20) representation if you don't want to match the others. Since you are using /x on your regexes, I'd suggest an octal or hex escape.
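
A quick way to see the difference is to compare the two character classes side by side. This is only a sketch: both patterns are adapted from the fullname entry in the question, and the variable names and sample strings are made up for the demonstration.

```perl
use strict;
use warnings;

# Adapted from the fullname pattern in the question.
my $with_s     = qr/^ [a-zA-Z\s'-]{3,30} \z/x;      # any whitespace allowed
my $with_space = qr/^ [a-zA-Z\x20'-]{3,30} \z/x;    # only a plain space allowed

for my $name ("Mary Smith", "Mary\tSmith", "Mary\nSmith") {
    # Make the control characters visible when printing.
    (my $shown = $name) =~ s/\n/\\n/g;
    $shown =~ s/\t/\\t/g;

    printf "%-14s \\s: %-8s \\x20: %s\n", $shown,
        ($name =~ $with_s     ? 'match' : 'no match'),
        ($name =~ $with_space ? 'match' : 'no match');
}
```

With \x20 in the class, the embedded tab and newline are refused while an ordinary space still passes.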

        -sauoq
        
Re: Regex Hell
by Cody Pendant (Prior) on Jul 08, 2003 at 21:26 UTC
    Your regexes appear to do what you want, though of course "\s" doesn't mean "space"; it means "whitespace".

    So why do you think they're broken?



    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.”
    M-J D
      Your regexes appear to do what you want
      Well, that would involve a bit of mindreading. In particular, I would suspect that the email regex is very wrong, for reasons pointed out elsewhere in this thread. Probably about the best thing you can say about this hash is the useless tautology:
      Your regexes appear to do what they do.
      {grin}

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        Yeah, point taken Merlyn.

        Everyone's got a bad email regex handy, haven't they? I got mine from Matt Wright.

        No seriously, I was ignoring the specifics of the email regex for the generality of "why do you think they're broken"?



Re: Regex Hell
by Molt (Chaplain) on Jul 09, 2003 at 11:35 UTC

    If you have to do email addresses without a module I'd recommend looking at Friedl's solution.

    Friedl is the author of O'Reilly's Mastering Regular Expressions and seems to know his stuff to a quite remarkable degree. His email code is explained in the book, if you want to know more about what's going on in it and about regular expressions in general.