in reply to Re: email regex
in thread email regex

Okay, looking at this in another light ... How does this substitution, which in theory strips the domain and route portion of the email address, fare?

( $domain ) = $email =~ /\@([^\@]+)$/;

Note that this substitution does not handle RFC822 addresses that contain elements beyond the local_part@domain atoms (however, this may be considered beyond the scope of this question).
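
For illustration, here is a minimal sketch of what the pattern above captures, next to the same pattern written as a substitution that strips that portion instead. The helper names domain_of and strip_domain and the sample addresses are made up for this example. The second address shows the kind of RFC822 comment the pattern does not cope with: the comment ends up inside the captured "domain".

```perl
use strict;
use warnings;

# Hypothetical helper: capture everything after the last '@'
# (returns undef if the address contains no '@').
sub domain_of {
    my ($email) = @_;
    my ($domain) = $email =~ /\@([^\@]+)$/;
    return $domain;
}

# Hypothetical helper: the same pattern as a substitution, which
# removes the portion after the last '@' and leaves the rest.
sub strip_domain {
    my ($email) = @_;
    $email =~ s/\@[^\@]+$//;
    return $email;
}

# Made-up sample addresses: a plain one, and one with a trailing comment.
for my $email ('fred@example.com', 'fred@example.com (Fred Bloggs)') {
    printf "%s => user [%s], domain [%s]\n",
        $email, strip_domain($email), domain_of($email);
}
```

Because the pattern anchors on the last '@' in the string, anything that follows the real domain (a comment, trailing whitespace) is treated as part of it.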

Update - Okay, updated using a subset of the RFC822 email address regular expression from Mastering Regular Expressions to strip off the trailing domain portion to leave the "username" element ...

$email =~ s/(?:\@[\040\t]*(?:(?:[^(\040)<>@,;:"\\\[\]\000-\037\x80-\xff]|(?:\[(\040)<>@,;:"\\\[\]\000-\037\x80-\xff]))+(?!(?:[^(\040)<>@,;:"\\\[\]\000-\037\x80-\xff]|(?:\[(\040)<>@,;:"\\\[\]\000-\037\x80-\xff])))|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:(?:[^(\040)<>@,;:"\\\[\]\000-\037\x80-\xff]|(?:\[(\040)<>@,;:"\\\[\]\000-\037\x80-\xff]))+(?!(?:[^(\040)<>@,;:"\\\[\]\000-\037\x80-\xff]|(?:\[(\040)<>@,;:"\\\[\]\000-\037\x80-\xff])))|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)*)$//;

 

perl -le 'print+unpack"N",pack"B32","00000000000000000000001001111011"'

Re: email regex
by Abigail-II (Bishop) on Aug 19, 2003 at 12:04 UTC
    How does this substitution, which in theory strips the domain and route portion of the email address, fare?

    I think you already answered your own question:

    Note that this substitution does not handle RFC822 addresses that contain elements beyond the local_part@domain atoms

    So, you tell us. For which (subset of RFC822?) email addresses does your regexp work?

    Abigail