in reply to Spam filtering regexp - keyword countermeasure countermeasure
First off, spam versus spam filtering is an ever-escalating arms race. You should be so lucky as to stay one step ahead of the spammers. Are you able to use something like SpamBayes? Bayesian filtering is quickly becoming the best way to deal with spam.
If you cannot run a tool like that, or just plain insist on writing this script, a good tactic might be to remove all punctuation and spaces from the subject line, then check whether any of a list of spam-ish words (debt, enlarge, coed) is contained in what's left. However, this idea will not handle ordinary acronyms. Another tactic might be to take that same list of spam-ish words and allow for non-alpha characters between each pair of letters:
/d[^A-Za-z]*e[^A-Za-z]*b[^A-Za-z]*t/i
But that's woefully inefficient.
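For what it's worth, here is a rough sketch of both tactics. It is in Python only because the post doesn't say what the script is written in, so treat the language as my assumption. The names `SPAM_WORDS`, `squash`, `squashed_hits`, `gap_pattern` and `gap_hits` are made up for illustration, and the word list is just the three examples above. Note that `squash` drops everything that isn't an ASCII letter (digits too), which is a bit broader than "punctuation and spaces".

```python
import re

# Hypothetical word list: just the three examples from the post.
SPAM_WORDS = ["debt", "enlarge", "coed"]


def squash(subject):
    """Lower-case the subject and drop every character that is not a-z."""
    return re.sub(r"[^a-z]", "", subject.lower())


def squashed_hits(subject):
    """Tactic 1: plain substring search on the squashed subject line."""
    flat = squash(subject)
    return [word for word in SPAM_WORDS if word in flat]


def gap_pattern(word):
    """Tactic 2: allow any run of non-letters between consecutive letters."""
    return re.compile("[^A-Za-z]*".join(map(re.escape, word)), re.IGNORECASE)


# Build the patterns once up front instead of on every subject line.
GAP_PATTERNS = {word: gap_pattern(word) for word in SPAM_WORDS}


def gap_hits(subject):
    """Return the listed words whose gap pattern matches the raw subject."""
    return [word for word, pat in GAP_PATTERNS.items() if pat.search(subject)]


print(squashed_hits("Get out of D.E.B.T today"))
print(gap_hits("Get out of d-e-b-t today"))
print(gap_pattern("debt").pattern)
```

Be aware that both tactics ignore word boundaries, so a subject like "Disco editor update" gets flagged for "coed" either way. You would probably want a whitelist or a score threshold on top of this rather than rejecting on a single hit.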
Just my 0.02