in reply to RegEx for email help

The reason your current regex won't work is that the asterisk is greedy, i.e. it grabs as much as it can. Since what it quantifies is allowed to match anything, it grabs everything except the last bit of whitespace (which is required for the regex to match at all).
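To make the greedy behaviour concrete, here is a minimal sketch. The input line and both patterns are hypothetical, since the regex from the original question isn't quoted here; they only illustrate how `.*` followed by `\s` behaves compared with the non-greedy `.*?`.

```perl
use strict;
use warnings;

# Hypothetical input: an address followed by more text and a trailing space.
my $line = 'To: someone@example.com extra words here ';

# Greedy: .* runs to the end of the line, then backs off only far enough
# for the trailing \s to match, so the capture swallows nearly everything.
my ($greedy) = $line =~ /To: (.*)\s/;

# Non-greedy: .*? stops at the first point where \s can match.
my ($lazy) = $line =~ /To: (.*?)\s/;

print "greedy: [$greedy]\n";   # greedy: [someone@example.com extra words here]
print "lazy:   [$lazy]\n";     # lazy:   [someone@example.com]
```

The non-greedy form fixes the over-matching here, but it only finds "the first run of non-whitespace", which is not the same thing as validating an address.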

More of an overall issue is that matching an e-mail address is quite a bit harder than most people think. See Email::Valid, which contains the generally accepted regex for matching e-mail addresses (it's several thousand characters long, and it doesn't even match embedded comments, as allowed by RFC 822).
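A rough sketch of what leaning on Email::Valid might look like, instead of a hand-rolled pattern. The candidate addresses are made up for illustration, and the module comes from CPAN rather than core Perl, so the sketch loads it at run time and just says so if it is missing.

```perl
use strict;
use warnings;

# Email::Valid is a CPAN module, not core Perl; load it at run time so this
# sketch still runs (and reports the fact) where it is not installed.
my $have_email_valid = eval { require Email::Valid; 1 };

# Hypothetical candidate strings, for illustration only.
my @candidates = ('someone@example.com', 'not an address');

for my $addr (@candidates) {
    if (!$have_email_valid) {
        print "$addr: Email::Valid not installed, skipping\n";
        next;
    }
    # address() returns the address if it is well formed, undef otherwise.
    my $ok = Email::Valid->address($addr);
    print "$addr: ", (defined $ok ? "looks valid" : "rejected"), "\n";
}
```

This only checks that the string is well formed; it says nothing about whether the mailbox exists.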

----
I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer

: () { :|:& };:

Note: All code is untested, unless otherwise stated

Re: Re: RegEx for email help
by rob_au (Abbot) on Jan 14, 2004 at 09:35 UTC
    I concur with all of the comments made by hardburn above. Matching email addresses is a much more complex task than most people realise. Generally, however, I lean towards using Email::Valid::Loose in place of Email::Valid, as this allows for better matching as per RFC 2822, which supersedes RFC 822, and permits the . (period) character in the local-part portion of the email address.

    Additionally, depending upon your matching requirements, it may be worth modifying URI::Find to match with the regular expression from Email::Valid::Loose ($Email::Valid::Loose::Addr_spec_re) in place of $URI::scheme_re.

     

    perl -le "print unpack'N', pack'B32', '00000000000000000000001010101011'"