Email suffix matching with a negative look-ahead regexp

jxz has asked for the wisdom of the Perl Monks concerning the following question:

Re: Email suffix matching with a negative look-ahead regexp by BazB (Priest) on Mar 02, 2003 at 14:44 UTC
Rather than implementing an anti-spam filter yourself, have you looked at the fine selection of modules on CPAN? Mail::Audit and Mail::SpamAssassin make an extremely flexible and easy-to-use mail filtering toolset. Have a look at the various Mail:: modules for all your filtering needs.

Cheers.
BazB

If the information in this post is inaccurate, or just plain wrong, don't just downvote - please post explaining what's wrong. That way everyone learns.
Re: Email suffix matching with a negative look-ahead regexp by dws (Chancellor) on Mar 01, 2003 at 23:41 UTC
The regex is: `.+\.(?!br|com|net|org)`

If you're just using () for grouping, `.+\.(?:br|com|net|org)` might serve you better, though that's a pretty heavy-handed way to deal with your email.
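
A minimal sketch of what the grouping form tests, in case it helps to see it run. The sub name `has_listed_suffix`, the sample addresses, and the trailing `$` anchor are assumptions added for illustration; they are not part of the pattern as posted.

```perl
use strict;
use warnings;

# True when the address ends in one of the listed suffixes.
# The trailing $ is an assumption added here; the pattern as posted is unanchored.
sub has_listed_suffix {
    my ($addr) = @_;
    return $addr =~ /.+\.(?:br|com|net|org)$/ ? 1 : 0;
}

# Hypothetical sample addresses, for illustration only.
for my $addr ('alice@example.com', 'foo@spammer.tw') {
    my $verdict = has_listed_suffix($addr) ? 'listed suffix' : 'other suffix';
    print "$addr => $verdict\n";
}
```

Note that `(?:...)` only groups, so this pattern matches the listed suffixes. Negating the result, or using `!~` instead of `=~`, would give the reversed test without any look-ahead.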
Re: Re: Email suffix matching with a negative look-ahead regexp by xmath (Hermit) on Mar 01, 2003 at 23:54 UTC
dws: No, he wants to reverse the test. The problem is that he uses a zero-width assertion, which consumes no characters, so the `\.` is required to be at the end of the string, and that obviously won't work.

jxz: Basically, your regex says the period may not be followed by 'br', 'com', 'net', or 'org', but you're also not permitting anything else to follow it. A solution, although not very pretty, would be to explicitly match a word after the period: `.+\.(?!br|com|net|org)\w`

The zero-width assertion will make sure that the word doesn't begin with 'br', 'com', 'net', or 'org'. I can't think of any top-level domains that begin with one of those but aren't equal to it, but if you're worried, then this should work: `.+\.(?!(?:br|com|net|org)$)\w`

BTW, as dws says, this is a pretty heavy-handed way to deal with your email. (I'm in 'nl' myself, so if I emailed you, would it be discarded as spam?)

Update: changed `\w+` to `\w` to discard addresses that end in a period as "spam"
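
To make the difference between the two look-ahead forms concrete, here is a hedged sketch. It assumes the patterns are anchored with `^` and `\w+$`, which is not how they are written above; without anchoring, an earlier period in the address (as in a hypothetical `foo@mail.example.com`) can satisfy the pattern after backtracking. The variable names and sample addresses are illustrative only.

```perl
use strict;
use warnings;

# Prefix form: the look-ahead rejects any suffix that begins with a listed name.
my $prefix_form = qr/^.+\.(?!br|com|net|org)\w+$/;

# Exact form: the $ inside the look-ahead limits the rejection to whole suffixes.
my $exact_form = qr/^.+\.(?!(?:br|com|net|org)$)\w+$/;

# Hypothetical sample addresses, for illustration only.
my @samples = (
    'alice@example.com',        # listed suffix
    'foo@spammer.tw',           # unlisted suffix
    'bob@example.community',    # begins with "com" but is not "com"
    'carol@example.',           # ends in a period, so no suffix to test
);

for my $addr (@samples) {
    printf "%-24s prefix=%s exact=%s\n",
        $addr,
        ($addr =~ $prefix_form ? 'match' : 'no'),
        ($addr =~ $exact_form  ? 'match' : 'no');
}
```

Only the third address separates the two forms: the prefix form treats it like a listed suffix, while the exact form matches it as unlisted.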
Re: Re: Re: Email suffix matching with a negative look-ahead regexp by jxz (Acolyte) on Mar 02, 2003 at 00:08 UTC
Thanks, it's working!

`junior:~$ perl -e 'die if "foo\@spammer.tw" =~ /^.+\.(?!br|com|net|org)\w+$/'`
`Died at -e line 1.`

OT: I receive a lot of spam from international domains, and with this regexp those messages are sent to a special folder. This doesn't cause me problems, because mailing-list emails are filtered out beforehand.
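
The ordering described here (mailing lists first, suffix test second) could be sketched roughly as follows. The list address, the folder names, and the sub name `folder_for` are all hypothetical; this is not the actual setup from the post, only one way such a rule order might look.

```perl
use strict;
use warnings;

# Hypothetical mailing-list sender; a real setup would have its own list.
my %list_sender = map { $_ => 1 } ('perl-list@example.tw');

# The anchored look-ahead pattern from the one-liner above.
my $unlisted_suffix = qr/^.+\.(?!br|com|net|org)\w+$/;

# Returns a (hypothetical) folder name for a sender address.
sub folder_for {
    my ($from) = @_;
    return 'lists'   if $list_sender{$from};        # mailing lists are checked first
    return 'suspect' if $from =~ $unlisted_suffix;  # then the suffix test
    return 'inbox';
}

for my $from ('perl-list@example.tw', 'foo@spammer.tw', 'alice@example.com') {
    print "$from => ", folder_for($from), "\n";
}
```

Because the list check runs first, a list address with an unlisted suffix never reaches the suffix test.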
Re**4: Email suffix matching with a negative look-ahead regexp by xmath (Hermit) on Mar 02, 2003 at 00:13 UTC