in reply to Re: Email suffix matching with a negative look-ahead regexp
in thread Email suffix matching with a negative look-ahead regexp

dws: No no, he wants to reverse the test.

The problem is that he uses a zero-width assertion, which consumes no characters. Nothing in the pattern matches the text after the period, hence the \. is required to be at the end of the string, so that obviously won't work.
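To make that concrete, here is a minimal sketch. The original pattern isn't quoted in this reply, so $broken is only my assumed reconstruction (the look-ahead followed directly by $), and the addresses are made up:

```perl
use strict;
use warnings;

# Assumed reconstruction of the failing pattern: the look-ahead is
# zero-width, so `$` has to match immediately after the period.
my $broken = qr/^.+\.(?!br|com|net|org)$/;

# Returns 1 if the address matches the pattern, 0 otherwise.
sub matches {
    my ($re, $addr) = @_;
    return $addr =~ $re ? 1 : 0;
}

# Single-quoted strings, so Perl does not try to interpolate @spammer.
for my $addr ('foo@spammer.tw', 'foo@example.com', 'foo@spammer.') {
    printf "%-16s %s\n", $addr, matches($broken, $addr) ? 'match' : 'no match';
}
```

Only the address that ends in a bare period should match. Neither 'foo@spammer.tw' nor 'foo@example.com' can, because nothing in the pattern is allowed to consume the suffix.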

jwx: basically, your regex says the period may not be followed by 'br', 'com', 'net', or 'org', but you're also not permitting anything else to follow it. A solution, although not very pretty, would be to explicitly match a word after the period:

.+\.(?!br|com|net|org)\w*

The zero-width assertion will make sure that the word doesn't begin with 'br', 'com', 'net', or 'org'. I can't think of any top-level domains that begin with one of those without being equal to it, but if you're worried, then this should work:

.+\.(?!(?:br|com|net|org)$)\w*
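The difference between the two patterns can be seen with a small sketch. I'm assuming both are anchored with ^ and $ (the replies in this thread test them that way). The addresses are made up, and 'company' just stands in for any suffix that starts with one of the listed words:

```perl
use strict;
use warnings;

# Both patterns from the post, anchored with ^ and $ (my assumption).
my $prefix = qr/^.+\.(?!br|com|net|org)\w*$/;       # look-ahead checks only the start of the suffix
my $exact  = qr/^.+\.(?!(?:br|com|net|org)$)\w*$/;  # look-ahead checks the whole suffix

# Returns 'spam' when the pattern matches (suffix not on the list), else 'ok'.
sub classify {
    my ($re, $addr) = @_;
    return $addr =~ $re ? 'spam' : 'ok';
}

# Made-up sample addresses.
for my $addr ('foo@example.com', 'foo@spammer.tw', 'foo@example.company', 'foo@example.') {
    printf "%-22s prefix=%-4s exact=%s\n",
        $addr, classify($prefix, $addr), classify($exact, $addr);
}
```

The two patterns should disagree only on 'foo@example.company': the first lets it through because the suffix begins with 'com', while the second flags it because the suffix is not exactly 'com'. Both flag the address that ends in a period, since \w* is allowed to match nothing.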

BTW, as dws says, this is a pretty heavy-handed way to deal with your email. (I'm in 'nl' myself, so if I emailed you, it would be discarded as spam?)

•Update: changed \w+ to \w* to discard addresses that end in a period as "spam"

Re: Re: Re: Email suffix matching with a negative look-ahead regexp
by jxz (Acolyte) on Mar 02, 2003 at 00:08 UTC
    Thanks, it's working!

    junior:~$ perl -e 'die if "foo@spammer.tw"=~/^.+\.(?!br|com|net|org)\w+$/'
    Died at -e line 1.

    OT: I receive a lot of spam from international domains, and with this regexp those messages are sent to a special folder. That doesn't cause me problems, because mailing-list emails are filtered out before this rule runs.

      OT, a good way to test patterns like these is:
      perl -lne 'print /^.+\.(?!br|com|net|org)\w+$/ ? "spam" : "ok"'

      Every line of input will be tested, so you can try out various email addresses to see if they match properly.

      Maybe you already knew the -n option, but I thought I'd mention it in case you didn't.
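For anyone who hasn't met the switches: per perlrun, -n wraps the code in a while (<>) loop, and -l chomps each input line and adds a newline to every print. A rough script equivalent of the one-liner is sketched below. It reads from an in-memory sample of made-up addresses instead of STDIN so that it runs unattended, and the verdict sub is just a name I picked for illustration:

```perl
use strict;
use warnings;

# Same test as the one-liner: 'spam' if the suffix does not begin
# with br, com, net, or org.
sub verdict {
    my ($addr) = @_;
    return $addr =~ /^.+\.(?!br|com|net|org)\w+$/ ? 'spam' : 'ok';
}

# Made-up sample input; with the real one-liner this comes from STDIN.
my $input = join "\n", 'foo@spammer.tw', 'foo@example.com', 'foo@example.org', '';

open my $in, '<', \$input or die "cannot open in-memory handle: $!";
while (my $line = <$in>) {        # the loop that -n supplies
    chomp $line;                  # what -l does on input
    print verdict($line), "\n";   # -l also appends the newline on output
}
close $in;
```

With the sample above, the first address should come out as spam and the other two as ok.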