in reply to Re: Getting Email environment address
in thread Getting Email environment address

I wrote a regexp a little while ago that validates an email address pretty well. It allows anything before the '@' except whitespace and '@', then alphanumerics and hyphens in the domain name (with the same rule for any number of subdomains), then an alphabetic TLD. I figured it would be a waste of time to maintain a list of all the TLDs, so I just check that the TLD has two or three letters, or is 'info' or 'museum'.

/^[^@\s]+@([a-z0-9\-]+\.)+([a-z]{2,3}|info|museum)$/i

I would very much like to know if anyone can offer improvements on this, as I plan to use it again soon. I've tried quite a number of garbage addresses and they were all rejected. One limitation is that characters like ';' and ',' are allowed in the first part, so someone could theoretically smuggle in extra recipients. Because '@' is rejected, though, those extras can't carry a domain of their own, so the worst case is that your program sends mail to local accounts or just generates errors. I felt this was an okay compromise, but to close the hole you could change the regexp to:

/^[^@\s;,]+@([a-z0-9\-]+\.)+([a-z]{2,3}|info|museum)$/i
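To make the difference between the two patterns concrete, here is a rough sketch that runs both against a few sample addresses. The patterns are written with a single '|' for alternation; the variable names, the classify helper, and the sample addresses are made up for illustration.

```perl
use strict;
use warnings;

# The two patterns from this post, with a single '|' for alternation.
my $loose  = qr/^[^@\s]+@([a-z0-9\-]+\.)+([a-z]{2,3}|info|museum)$/i;
my $strict = qr/^[^@\s;,]+@([a-z0-9\-]+\.)+([a-z]{2,3}|info|museum)$/i;

# Hypothetical helper: returns (loose_ok, strict_ok) as 1/0 flags.
sub classify {
    my ($addr) = @_;
    return ( ($addr =~ $loose ? 1 : 0), ($addr =~ $strict ? 1 : 0) );
}

# Illustrative samples only; they are not taken from a real test run.
for my $addr ('user@example.com', 'user@example.museum',
              'a,b@example.com', 'user@example.', 'foo bar@example.com') {
    my ($l, $s) = classify($addr);
    printf "%-22s loose=%d strict=%d\n", $addr, $l, $s;
}
```

The only sample on which the two patterns should disagree is 'a,b@example.com', which the first accepts and the second rejects.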

Of course, if you really want to validate the address, you could always do a DNS lookup on the domain name (don't laugh, I know people who've done it), but I can't think of any way to check whether the mailbox itself exists...
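For what it's worth, a DNS check along those lines might look roughly like this. It is only a sketch: it assumes the Net::DNS module from CPAN is installed, and the names lookup_mx and domain_accepts_mail are invented for this example. It only asks whether the domain publishes MX records, so a domain that takes mail on a plain address record would be reported as failing, and it says nothing about whether the mailbox exists.

```perl
use strict;
use warnings;

# Default lookup: ask DNS for the domain's MX hosts via Net::DNS.
# Net::DNS is loaded lazily so the rest of the code works without it.
sub lookup_mx {
    my ($domain) = @_;
    require Net::DNS;
    return map { $_->exchange } Net::DNS::mx($domain);
}

# Hypothetical helper: true if the domain part of $addr has at least one
# MX host. $lookup can be replaced with a stub to avoid network access.
sub domain_accepts_mail {
    my ($addr, $lookup) = @_;
    $lookup ||= \&lookup_mx;
    my ($domain) = $addr =~ /\@([^\@\s]+)\z/ or return 0;
    my @hosts = $lookup->($domain);
    return @hosts ? 1 : 0;
}

# Example with a stubbed lookup, so nothing touches the network here.
my $stub = sub { return ('mx.example.com') };
print "looks deliverable\n" if domain_accepts_mail('user@example.com', $stub);
```

Calling domain_accepts_mail($addr) with no second argument does the real lookup, which is slow and can fail for reasons unrelated to the address, so it is probably best run after the regexp check rather than instead of it.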


LAI
:eof

Re: Getting Email environment address
by Abigail-II (Bishop) on Nov 15, 2002 at 14:44 UTC
    foo" "bar@abigail.nl is a valid address, but it will be stopped by your regex. There are a few modules on CPAN that check addresses against RFC 822, for instance RFC::RFC822::Address.

    Abigail

      Thanks Abigail-II; I had no idea that sort of address is valid. Perhaps I should have RTFM!

      Now to read up on RFC 822...


      LAI
      :eof