in reply to Re: On Validating Email Addresses
in thread On Validating Email Addresses

merlyn already noted the TLD problem. But really, you're now being too generous. The real solution is to use Email::Valid, which contains a very large and complex regex, plus a few other validation routines.

As complex as that regex is, it still won't match embedded comments in the address, but that's usually not a problem.

"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re^3: On Validating Email Addresses
by diotalevi (Canon) on Jan 05, 2005 at 06:27 UTC

    What?! Email::Valid fails on embedded comments? That's an astonishingly common feature of actual email addresses in the wild. I managed a number of public inboxes for a global corporation for a few years and I had to take special care in my own email address parsing code (in a VB dialect) to handle comments.

    I mean, of the form (Fname Lname) <addr@example.com> and <addr@example.com> (Fname Lname). I never saw addr@example( ... ).com. Of those three forms, which are supported? Anything good will handle the first two and I don't think the third matters. I'm speaking only from what I saw in actual usage.

      s/embedded/nested/g

      The regex doesn't handle comments nested inside of comments. It does handle comments (one level deep only).
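The difference between one-level and nested comments can be shown with a depth counter instead of a regex. The following is only a rough sketch in Python, not how Email::Valid works internally; the function name `strip_comments` is made up for illustration, and the function only removes comments, it does not validate anything.

```python
def strip_comments(text: str) -> str:
    """Remove parenthesised comments, including nested ones (illustrative sketch)."""
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # parentheses inside "quoted strings" are literal
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A backslash escapes the next character; keep the pair outside comments.
            if depth == 0:
                out.append(text[i:i + 2])
            i += 2
            continue
        if ch == '"' and depth == 0:
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "(" and not in_quotes:
            depth += 1            # entering a (possibly nested) comment
        elif ch == ")" and depth > 0:
            depth -= 1            # leaving one comment level
        elif depth == 0:
            out.append(ch)        # ordinary text outside any comment
        i += 1
    if depth != 0:
        raise ValueError("unbalanced comment parentheses")
    # Simplification: collapses runs of whitespace, even inside quoted strings.
    return " ".join("".join(out).split())


print(strip_comments("(Fname Lname) <addr@example.com>"))
print(strip_comments("(Fname (nick) Lname) <addr@example.com>"))
```

Both calls leave just the bracketed address. A counter like this is what a classic regex lacks, which is why a pattern that handles one comment level does not automatically handle comments inside comments.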

      - tye        

        I don't think I ever saw nested comments. Thanks for clarifying.

      Besides tye's point below, I don't think it matters much in common usage of Email::Valid, anyway. I've only used it for validating form input, and I imagine this tends to be the most common case. How often do you type (Fname Lname) <addr@example.com> into a form? I always just type the address alone.

      "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

        Ok, so you won't type that into a form. Your email client will happily give that string to me and it's still subject to validity tests. Email::Valid should handle addresses as generated by backend systems and not just what people are likely to type into a text box.
Re^3: On Validating Email Addresses
by Thilosophy (Curate) on Jan 05, 2005 at 01:53 UTC
    But really, you're now being too generous. The real solution is to use Email::Valid, which contains a very large and complex regex, plus a few other validation routines.

    Well, my point was that you cannot validate the email with a regular expression anyway. You are very unlikely to even catch typos. If my email is bill@microsoft.com and I mistype it as bikk@microsoft.com, how is Email::Valid going to help you? So why bother at all?

    Concession: Email::Valid can also check if an MX entry exists for the domain. That might make sense in some situations (but it still does not check the user name -- is there a way to do this, too?)

      As you suggested, about the best possible check that you can hope to perform for the purposes of catching typos is to ask the user to type it twice.

      The problem with that is that I always copy&paste when I'm asked to do that, so if I type it incorrectly the first time, it just gets confirmed incorrectly.


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.
      Ah, but even checking the MX is fraught with danger. What if the domain's name servers are offline for the moment, or the local nameservers are not working, or no local nameserver is configured at all because of security rules? Do you wait 30 seconds to fall back to the secondary? Do you reject the address as invalid? It's probably much better to send some sort of cookie to the e-mail address and require it to continue, if having a working e-mail address really is important.
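One way to picture the "cookie" idea is a signed token that is mailed to the address and must be presented back. This is a minimal sketch in Python under stated assumptions: `SECRET_KEY`, `make_token` and `check_token` are hypothetical names, the key would normally come from configuration, and a real system would likely also add an expiry time and actually send the mail.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; generated per process here for illustration.
SECRET_KEY = secrets.token_bytes(32)


def make_token(address: str, key: bytes = SECRET_KEY) -> str:
    """Derive a confirmation token bound to this exact address."""
    return hmac.new(key, address.encode("utf-8"), hashlib.sha256).hexdigest()


def check_token(address: str, token: str, key: bytes = SECRET_KEY) -> bool:
    """Return True only if the token was issued for this address."""
    return hmac.compare_digest(make_token(address, key), token)


token = make_token("bill@example.com")          # would be e-mailed to the user
print(check_token("bill@example.com", token))   # the address the mail went to
print(check_token("bikk@example.com", token))   # a mistyped address
```

The point of the design is that only someone who can read mail at the address ever sees the token, so typos and unreachable domains are caught without any DNS lookups at form-submission time.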

      As an aside, + is valid in the username portion of the e-mail address, and I try to use it regularly, I really do. However, the only form I've found so far that actually accepts it (without causing problems) is the mailman interface.
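To illustrate the + problem, here is a small Python sketch. The `NAIVE` pattern is invented for this example and only stands in for the kind of narrow check a form might use; `loose_check` is likewise a hypothetical name for a much more permissive sanity test, not a full validator.

```python
import re

# A deliberately narrow pattern (hypothetical): the local part allows
# letters, digits, dot, underscore and hyphen, but not "+".
NAIVE = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")


def loose_check(address: str) -> bool:
    """Permissive sanity check: something@something.with.a.dot, no spaces."""
    local, sep, domain = address.rpartition("@")
    return bool(sep and local and "." in domain and " " not in address)


addr = "user+tag@example.com"
print(NAIVE.match(addr) is not None)  # the narrow pattern's verdict
print(loose_check(addr))              # the permissive check's verdict
```

The narrow pattern rejects the address while the permissive check accepts it, which is the trade-off in this thread: the stricter the hand-rolled pattern, the more legitimate addresses it turns away.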