Routing SPAM emails to a junk folder

darkphorm has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

My mail system (Postfix) is currently set to deliver all spammy/infected emails to a global junk folder. As it stands, emails are gzip'ed and then dropped in this folder after they fail scanning. There don't seem to be many options for delivering to a per-user SPAM folder, nor is there anything in the file naming that indicates the ultimate recipient (which I suppose makes sense, as there could be multiple recipients).

Now, I've made a script which reads the userbase into a hash and subs in values from /etc/aliases as needed. It reads the gzip'ed email (using Compress::Zlib), identifies the recipient from the X-Envelope-To header, and compares it against valid users/aliases. (It probably won't work with multi-recipient mail, but that's a task I'll work on afterwards. It does work for multi-user aliases, aka lists.)

So... what I'd like to do is either:

Copy the email to the recipient(s) "~/Maildir/.Junk/new" folder
Move SPAM to a spool folder, and Symlink the email for all recipients from the main folder
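
The first option (a physical copy per recipient) could be sketched roughly as below. This is only an illustration under stated assumptions: the sub names, the shape of the users/aliases hashes, and the reuse of the scanner's file name as the Maildir file name are all hypothetical, and folded (multi-line) headers are not handled. It uses IO::Uncompress::Gunzip, which ships alongside Compress::Zlib, and writes through the folder's tmp/ directory before renaming into new/, as Maildir delivery normally does.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Return the lower-cased addresses found in X-Envelope-To headers of a
# gzip'ed message. Only the header block (up to the first blank line) is read.
sub envelope_recipients {
    my ($gz_path) = @_;
    my $z = IO::Uncompress::Gunzip->new($gz_path)
        or die "Cannot read $gz_path: $GunzipError\n";
    my @rcpts;
    while (defined(my $line = $z->getline)) {
        $line =~ s/\r?\n\z//;
        last if $line eq '';    # blank line: end of headers
        if ($line =~ /^X-Envelope-To:\s*(.*)$/i) {
            push @rcpts, map { lc($_) } grep { length } split /[\s,<>]+/, $1;
        }
    }
    $z->close;
    return @rcpts;
}

# Map addresses to local users. $users is { login => 1 }, $aliases is
# { alias => [logins] } (assumed shapes, e.g. built from /etc/aliases).
sub resolve_users {
    my ($rcpts, $users, $aliases) = @_;
    my %seen;
    for my $addr (@$rcpts) {
        (my $local = $addr) =~ s/\@.*//s;
        my @targets = exists $aliases->{$local} ? @{ $aliases->{$local} } : ($local);
        $seen{$_} = 1 for grep { exists $users->{$_} } @targets;
    }
    return sort keys %seen;
}

# Decompress $gz_path into "$junk_dir/new" via "$junk_dir/tmp", so a mail
# reader never sees a half-written file. Assumes the name is already unique.
sub deliver_copy {
    my ($gz_path, $junk_dir) = @_;
    (my $name = basename($gz_path)) =~ s/\.gz\z//;
    my $tmp = "$junk_dir/tmp/$name";
    gunzip($gz_path => $tmp) or die "gunzip $gz_path failed: $GunzipError\n";
    rename $tmp, "$junk_dir/new/$name" or die "rename $tmp: $!\n";
    return "$junk_dir/new/$name";
}

# Hypothetical usage:
# for my $user (resolve_users([envelope_recipients($gz)], \%users, \%aliases)) {
#     deliver_copy($gz, "/home/$user/Maildir/.Junk");
# }
```

For the second option, the same gunzip step would write into the spool instead, followed by one symlink() call per resolved user.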

Now, since the script will be running under cron and will probably have a lot of junk mail to deal with at times, I may limit it to X messages per run. For deleting the SPAM email, I'm thinking that symlinks are the way to go. The cleanup would then be:

Check the spool folder. If a message is older than $EXPIRY (say, about 1 week), nuke it; likewise if the message has no more symlinks (meaning that all linked users have deleted it).

The alternative, of course, is to scan each user's ~/Maildir/.Junk/cur and ~/Maildir/.Junk/new. Either way has problems. Symlinks:

I'm not sure yet how it will behave if a user moves the email to another folder (will it copy/move the symlink, or create a new file in the other folder?).
If the symlink is simply moved as a symlink, deleting dated symlinks won't work, as the user may in fact have relocated a message to the inbox (say it was mislabelled as SPAM).
Also, links can be awkward between filesystems (strictly, only hard links are unable to cross them), but this should be OK for the situation here
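
The spool cleanup described above could look something like the following sketch. One thing worth noting: a symlink does not raise the link count of its target, so "no more symlinks" cannot be read off the spool file itself and has to be found by scanning the user folders. The sub names are hypothetical, and the code assumes the symlinks were created with absolute targets spelled exactly as "$spool/$name". It makes no assumption about what the IMAP server does when a user moves a message.

```perl
use strict;
use warnings;

# Collect every symlink target found in new/ and cur/ of the given
# Junk folders. Returns a hash ref: { target_path => 1 }.
sub referenced_targets {
    my (@junk_dirs) = @_;
    my %live;
    for my $junk (@junk_dirs) {
        for my $sub (qw(new cur)) {
            my $dir = "$junk/$sub";
            opendir(my $dh, $dir) or next;    # folder may not exist yet
            for my $entry (readdir $dh) {
                my $path = "$dir/$entry";
                next unless -l $path;
                my $target = readlink $path;
                $live{$target} = 1 if defined $target;
            }
            closedir $dh;
        }
    }
    return \%live;
}

# Remove spool files that are older than $expiry_secs or that no user
# symlink points at any more. Returns the list of removed paths.
sub purge_spool {
    my ($spool, $live, $expiry_secs, $now) = @_;
    $now //= time;
    my @removed;
    opendir(my $dh, $spool) or die "opendir $spool: $!\n";
    for my $entry (sort readdir $dh) {
        my $path = "$spool/$entry";
        next unless -f $path && !(-l $path);    # plain files only
        my $age = $now - (stat $path)[9];       # seconds since last modification
        if ($age > $expiry_secs || !$live->{$path}) {
            unlink $path or warn "unlink $path: $!\n";
            push @removed, $path;
        }
    }
    closedir $dh;
    return @removed;
}

# Hypothetical usage, with a one-week expiry:
# my $live = referenced_targets(glob '/home/*/Maildir/.Junk');
# purge_spool('/var/spool/junk', $live, 7 * 24 * 3600);
```

Expired spool files leave dangling symlinks behind in the user folders; a follow-up pass that unlinks any entry where -l is true and -e is false would tidy those up.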

The issue with making physical copies of the file is much the same:

Delivering copies to user junk Maildirs will take up much more space
Individual user junkmail folder would need to be scanned for date/time of messages (for deletion), which is more of a pain
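
For comparison, the per-user scan is not much code either, though it does touch every Junk folder on each run. A minimal sketch, assuming expiry is judged by file modification time and using a hypothetical sub name; lstat is used so that a symlinked message is aged by the link itself rather than by its target:

```perl
use strict;
use warnings;

# Delete messages older than $expiry_secs from new/ and cur/ of one
# Junk folder. Returns the list of removed paths.
sub purge_junk_folder {
    my ($junk_dir, $expiry_secs, $now) = @_;
    $now //= time;
    my @removed;
    for my $sub (qw(new cur)) {
        my $dir = "$junk_dir/$sub";
        opendir(my $dh, $dir) or next;          # skip folders that do not exist
        for my $entry (sort readdir $dh) {
            next if $entry =~ /^\./;            # ".", ".." and dotfiles
            my $path = "$dir/$entry";
            my @st = lstat $path;
            next unless @st && (-f _ || -l _);  # plain files and symlinks only
            next if $now - $st[9] <= $expiry_secs;
            unlink $path and push @removed, $path;
        }
        closedir $dh;
    }
    return @removed;
}

# Hypothetical usage from cron, one week expiry:
# purge_junk_folder($_, 7 * 24 * 3600) for glob '/home/*/Maildir/.Junk';
```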

So, does anyone have a better way of doing this, or some pointers on how I'm doing it now?

Replies are listed 'Best First'.
Re: Routing SPAM emails to a junk folder by CountZero (Bishop) on Jul 20, 2004 at 18:24 UTC
Some tough design decisions here! Just let me tell you how I did it: my spam detector labels all spam mail with "`[SPAM]`" in the subject line and adds a new X-header, "SPAM". Users can then use their own mail readers to catalogue, delete, or do whatever they like with their mail. Seems to work OK.

CountZero
"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
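
The tagging approach described in this reply might be sketched as follows. This is not CountZero's actual code: the sub name, the exact header name `X-SPAM`, and its value are guesses, and the message is assumed to use plain LF line endings with unfolded headers.

```perl
use strict;
use warnings;

# Prefix the Subject with "[SPAM]" and add an X-SPAM header.
# Takes and returns the raw message text; safe to apply twice.
sub tag_spam {
    my ($raw) = @_;
    my ($head, $body) = split /\n\n/, $raw, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;

    # Tag the subject unless it already carries the marker;
    # add a Subject header if the message has none.
    if ($head !~ /^Subject:[ \t]*\[SPAM\]/mi) {
        $head =~ s/^(Subject:[ \t]*)/$1 . '[SPAM] '/mie
            or $head .= "\nSubject: [SPAM]";
    }

    # Header name and value are assumptions for illustration.
    $head .= "\nX-SPAM: Yes" if $head !~ /^X-SPAM:/mi;

    return "$head\n\n$body";
}

# Hypothetical usage:
# print tag_spam("From: a\@b\nSubject: Cheap pills\n\nBuy now\n");
```

Client-side filters can then match on either the subject prefix or the header.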
For server-side cleanup by darkphorm (Beadle) on Jul 20, 2004 at 18:33 UTC
That's fine for the clients, but the problem is more that SPAM is currently accounting for upwards of 25% of the used space in homedirs on the mailserver. Also, the clients may have to download hundreds of spammy messages in a day, bogging down the internet connection and delaying the emails they want. Thus, I want to drop it into an IMAP spam folder (which can be accessed in the webmail) and segregate the spammy messages. If somebody is on vacation, their spam will be cleared after a week rather than building up.
Re: For server-side cleanup by CountZero (Bishop) on Jul 20, 2004 at 21:08 UTC
Just a wild thought: give every user two e-mail addresses: the real one is "user@my.email.server" and the one for the spam is "spam.user@my.email.server". If you make the spam box available only through webmail, your users do not have to download all the spam messages, and you can easily delete these messages after a reasonable expiry period.

CountZero