Detecting bounced mails

Corion has asked for the wisdom of the Perl Monks concerning the following question:

I get bounced mails. Lots of them. And most of them aren't even for mails that originated within my system. So I decided to write me a Mail::Audit plugin, which decides whether a mail is actually a bounce, to sort the bounce into a special folder. Later on, I'll add a check against the message ID of the mail and delete all mails that were bounced but did not originate on my system...

But I only have a shallow understanding of email headers, and maybe I'm doing this in a completely wrong way, and there is a much easier way to specify what I want. So without further ado, here is my module. Comments welcome!

package Mail::Audit::Bounce;
use Mail::Audit;
use vars qw($VERSION);
$VERSION = '0.01';
1;

=head1 NAME

Mail::Audit::Bounce - Recognize a mail as a bounce mail

=cut

package Mail::Audit;
use strict;

my $content_type = qr(report-type=delivery-status);
my $sender       = qr(Mail Delivery Subsystem <MAILER-DAEMON\@);

my @headers = (
  'X-Virus-Found'       => [ qr/./, "Virus found" ],   # header must be present
  'X-Tnz-Problem-Type'  => [ qr/^40$/, "Virus found" ],
  'X-Failed-Recipients' => [ qr/./, undef ],
  'Auto-Submitted'      => [ qr<auto-generated \(failure\)>, undef ],
  'Content-Type'        => [ qr<multipart/report; report-type=delivery-status;>ism, undef ],
  'Content-Type'        => [ qr<multipart/report; report-type=(["'])delivery-status\1;>ism, undef ],
  'Received'            => [ qr<\(qmail \d+ invoked for bounce\);>sm, undef ],
);

sub __scan_delivery_report {
  my ($self) = @_;
  my @result;

  my @body;
  for my $line (@{$self->body}) {
    push @body, split /\n/, $line;
  };

  my $i = 0;
  while ($i < @headers) {
    my ($header,$r) = (@headers[$i,$i+1]);
    my ($content,$reason) = @$r;
    my @h = ($self->head->get($header), map { /^$header: (.*)$/ ? $1 : () } @body);
    for my $line (@h) {
      push @result, $reason
        if ($line =~ $content);
    };
    $i += 2;
  };

  @result;
};

sub __mime_parts {
  my ($self,$content_type) = @_;
  grep { ($_->head->get('Content-Type')||"") =~ /$content_type/i } ($self->parts)
};

sub is_bounce {
  my ($self) = @_;
  my @parts;
  my @reasons;

  push @reasons, "mailer daemon"
    if $self->from =~ $sender;
  my $i = 0;
  while ($i < @headers) {
    my ($header,$r) = (@headers[$i,$i+1]);
    my ($content,$reason) = @$r;
    my @h = $self->head->get($header);
    for my $line (@h) {
      push @reasons, $reason
        if ($line =~ $content);
    };
    $i += 2;
  };

  if ($self->is_mime) {
    @parts = $self->parts;
    foreach my $part ($self->__mime_parts('message/delivery-status')) {
      push @reasons, __scan_delivery_report($part)
    };
  };

  if (@reasons) {
    @reasons = grep { defined $_ } @reasons;
    @reasons = "unknown" unless @reasons;
  };

  @reasons;
};

=head2 C<< $message->original_message_id >>

This tries to find the original message id for a bounced
message. C<< is_bounce >> should be true before you ask
for the original message id.

It returns a list of candidates for the original message id.

=cut

sub original_message_id {
  my ($self) = @_;
  my %result;

  if ($self->is_mime) {
    # let's hope for the original message in
    # the 'message/rfc822' part
    foreach my $part ($self->__mime_parts('message/rfc822'),$self->__mime_parts('text/rfc822-headers')) {
      for (map { /^Message-Id: (.*)$/i ? $1 : () } @{ $part->body }) {
        $result{$_} = 1
      };
    };
  } else {
    for (map { /^Message-Id: (.*)$/i ? $1 : () } @{ $self->body }) {
      $result{$_} = 1
    };
  };

  keys %result;
};

In the module, I consider stuff a bounce that:

Comes from MAILER-DAEMON@* or
Contains one of the above listed headers with a value matching the regular expression

The method of collecting and checking the headers seems clumsy and inflexible to me, but except for InterScan NT virus bounces, it has proven to be effective for the kinds of bounces I get. If anyone has a more elegant method, it is welcomed.
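One possible way to make the header checks less clumsy is to keep each rule as a small record and run a single loop over all of them. The following is only a minimal sketch under some assumptions: it works on a raw message string with core Perl and does not use Mail::Audit, the names `@rules`, `parse_headers` and `bounce_reasons` are made up for illustration, and the rules are a subset of the ones in the module above. Unlike the module, a rule without a reason reports the name of the header that matched.

```perl
use strict;
use warnings;

# Each rule: header name, pattern the value must match, optional reason.
my @rules = (
    { header => 'From',                pattern => qr/Mail Delivery Subsystem <MAILER-DAEMON\@/, reason => 'mailer daemon' },
    { header => 'X-Virus-Found',       pattern => qr/./, reason => 'Virus found' },
    { header => 'X-Failed-Recipients', pattern => qr/./ },
    { header => 'Auto-Submitted',      pattern => qr/auto-generated \(failure\)/ },
    { header => 'Content-Type',        pattern => qr/report-type=(["']?)delivery-status\1/i },
);

# Split the header block of a raw message into { name => [values] },
# unfolding continuation lines. Header names are lower-cased.
sub parse_headers {
    my ($raw) = @_;
    my ($head) = split /\r?\n\r?\n/, $raw, 2;
    $head =~ s/\r?\n[ \t]+/ /g;    # unfold continuation lines
    my %h;
    for my $line (split /\r?\n/, $head) {
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        push @{ $h{ lc $name } }, $value;
    }
    return \%h;
}

# Return one reason per rule/header value that matched; empty list if none.
sub bounce_reasons {
    my ($raw) = @_;
    my $h = parse_headers($raw);
    my @reasons;
    for my $rule (@rules) {
        for my $value (@{ $h->{ lc $rule->{header} } || [] }) {
            push @reasons, $rule->{reason} // "matched $rule->{header}"
                if $value =~ $rule->{pattern};
        }
    }
    return @reasons;
}
```

Adding a new kind of bounce then means adding one line to `@rules`. Only the top-level headers are inspected here; the `message/delivery-status` parts that the module also scans are left out of this sketch.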

Possibly, some SpamAssassin rules could also do the same for me, but I haven't gotten around to installing and using it - I'm still using my hand-made mail sorting script as my first line of defense.


Re: Detecting bounced mails by matija (Priest) on Mar 07, 2004 at 21:10 UTC
I can't see anything wrong with how you parse the headers, offhand. But I'd like to warn you that if you throw away all bounces, you will not know when a mail of yours gets bounced. Speaking for myself, I wouldn't dare do that: I mistype an email address at least once a week.
You could look through the mail message to see if it carries a message-id of the format that comes from your mail server.
If the bounce includes the headers of the original mail (some, particularly "virus intercepted" ones, don't), you could look through those to see if your IP was mentioned.
You could parse the logs of your mail server and make a note of the IDs of outgoing messages.
You could allow bounces in for, say, 10 minutes after you send a message (NOT recommended).
Personally, I'd parse the logs and look for valid Message-IDs. But it's not foolproof.
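The Message-ID checks suggested in this reply could be sketched roughly as follows. This is an illustration, not a tested filter: the host name `mail.example.org` and the function names are hypothetical, and the set of sent IDs is assumed to come from somewhere else (for example the mail server logs). The text passed in should be the body of the bounce, so that the bounce's own Message-Id header is not counted.

```perl
use strict;
use warnings;

# Hypothetical: the host part our own MTA puts into Message-IDs.
my $own_id_host = 'mail.example.org';

# Collect every Message-Id value quoted anywhere in the bounce body.
sub quoted_message_ids {
    my ($text) = @_;
    my %seen;
    $seen{$1}++ while $text =~ /^Message-Id:\s*<([^<>\s]+)>/mig;
    return sort keys %seen;
}

# A bounce is plausibly ours if a quoted ID is in the set of IDs we
# know we sent, or (a weaker signal) if its host part is our own.
# Returns a short label, or an empty string if nothing matched.
sub bounce_is_ours {
    my ($text, $sent_ids) = @_;
    for my $id (quoted_message_ids($text)) {
        return 'known id' if $sent_ids->{$id};
        return 'own host' if $id =~ /\@\Q$own_id_host\E\z/i;
    }
    return '';
}
```

The host-part check alone is easy to forge, which is one reason the lookup against IDs that were really sent is the stronger test. Bounces that do not quote the original headers at all cannot be classified this way.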
Re: Detecting bounced mails by exussum0 (Vicar) on Mar 07, 2004 at 21:10 UTC
There is NO standard for headers in the body of an email that everyone adheres to, so a catch-all will be hard to write. For instance, try telnetting to port 25 on some random mail server with a standard EHLO, MAIL FROM and RCPT TO (fake them, of course, to get a bounce) and a blank line for the data portion. You'll see that it goes through fine and the email gets sent, and you should expect no less from any mail coming to you, from a user or a server. What you can count on statistically is that there is a large handful of servers, each with its own way of sending out MAILER-DAEMON-like errors: qmail likes to whine about the bounces bouncing, and sendmail has its "I tried for 5 days" type of thing. My advice is to find people who run qmail, sendmail, Exchange, yahoo.com's, msn.com's and a bunch of others, and learn how their bounce-back messages look. I can't say they all do one specific thing, as it is up to them to dictate how things work on their end. But good choice of language, though. Perl/PERL/perl is a good choice for this type of work.
Re: Detecting bounced mails by Vautrin (Hermit) on Mar 07, 2004 at 20:42 UTC
What e-mail program do you use? It seems to me that it would be much more reliable to have every email you send registered in a database (i.e. whoever you're emailing to), and if you get a bounce from that domain within 24-48 hours you let it through; otherwise you drop the email from your bounced list. There are, unfortunately, more than just `MAILER-DAEMON@*` daemons out there. Has that proved a problem for you?
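The time-window idea from this reply might look something like the sketch below. An in-memory hash stands in for the database, the 48-hour window is the upper end of the range mentioned above, and all names (`record_outgoing`, `bounce_expected`, `domain_of`) are invented for the example.

```perl
use strict;
use warnings;

my $WINDOW = 48 * 60 * 60;    # seconds

my %last_sent;                # recipient domain => epoch time of last send

# Extract the lower-cased domain of an address, or undef if there is none.
sub domain_of {
    my ($address) = @_;
    my ($domain) = $address =~ /\@([\w.-]+)/;
    return defined $domain ? lc $domain : undef;
}

# Call this for every outgoing mail.
sub record_outgoing {
    my ($recipient, $when) = @_;
    my $domain = domain_of($recipient) // return;
    $last_sent{$domain} = $when // time;
}

# True if we mailed the bounce sender's domain within the window.
sub bounce_expected {
    my ($bounce_sender, $when) = @_;
    my $domain = domain_of($bounce_sender) // return 0;
    my $sent   = $last_sent{$domain}       // return 0;
    $when //= time;
    return ( $when >= $sent && $when - $sent <= $WINDOW ) ? 1 : 0;
}
```

One limitation of matching on domain: a bounce can be generated by a relay whose domain differs from the recipient's, so this check would drop some legitimate bounces.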
Re: Re: Detecting bounced mails by Corion (Patriarch) on Mar 07, 2004 at 20:53 UTC
Currently I'm storing all my outgoing mail in an mbox file, so I could also periodically scan that file, or maybe scan my exim log for the outgoing message IDs. But first, I need to determine if an incoming mail is a bounce before I can decide whether I want to see it or not... I did test my module against the mails of yesterday and the two days before, and it finds all bounces. More I can't say :-)
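Scanning the outgoing mbox for message IDs could be done along these lines. This is a sketch with core Perl only; `mbox_message_ids` is a made-up name, and it assumes a well-formed mbox in which every message starts with a `From ` separator line and body lines beginning with `From ` are escaped. A Message-Id whose value is folded onto a continuation line is not picked up.

```perl
use strict;
use warnings;

# Return a hash ref of every Message-ID found in the message headers
# (not the bodies) of an mbox-format filehandle.
sub mbox_message_ids {
    my ($fh) = @_;
    my %ids;
    my $in_header = 0;
    while ( my $line = <$fh> ) {
        if ( $line =~ /^From / ) {       # separator line starts a new message
            $in_header = 1;
        }
        elsif ( $line =~ /^\r?$/ ) {     # blank line ends the header block
            $in_header = 0;
        }
        elsif ( $in_header && $line =~ /^Message-Id:\s*<([^<>\s]+)>/i ) {
            $ids{$1} = 1;
        }
    }
    return \%ids;
}
```

The resulting hash can then be used to look up the IDs returned by `original_message_id` for an incoming bounce.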
Re: Detecting bounced mails by TomDLux (Vicar) on Mar 07, 2004 at 23:56 UTC
The reason you are getting all these bounces is that some spammer is impersonating you. I had lots of bounce messages, but could not prove anything until a mail system sent me the complete bounced message, headers and all. It was being sent from System X.com to Somewhere.ru, with my address in the From: slot. But none of the headers showing arrival time at the various hosts along the way mentioned my ISP. I set up a mail filter to drop bounces. -- `TTTATCGGTCGTTATATAGATGTTTGCA`
•Re: Detecting bounced mails by merlyn (Sage) on Mar 09, 2004 at 02:17 UTC
Why are you reinventing Mail::DeliveryStatus::BounceParser? -- Randal L. Schwartz, Perl hacker
Re: •Re: Detecting bounced mails by Corion (Patriarch) on Mar 09, 2004 at 07:56 UTC
I knew there already was such a module, but I looked (twice) through CPAN and didn't find it, so I rewrote it (and it wasn't hard to do). Thanks for finding it for me! Now I'll simply turn my module into some glue and use the other module :-)
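Such glue could look roughly like the sketch below. It relies on the documented Mail::DeliveryStatus::BounceParser interface (`new`, `is_bounce`, `reports`, and `get` on each report with the `email` and `std_reason` fields); the wrapper name `bounce_details` is made up, and the code has not been run against real bounces. The module is loaded at run time, so the function simply returns nothing if it is not installed.

```perl
use strict;
use warnings;

# Returns one { email => ..., std_reason => ... } hash ref per failed
# recipient, or an empty list if the message is not recognized as a
# bounce, cannot be parsed, or the parser module is unavailable.
sub bounce_details {
    my ($raw_message) = @_;

    eval { require Mail::DeliveryStatus::BounceParser; 1 } or return;

    my $bounce = eval { Mail::DeliveryStatus::BounceParser->new($raw_message) }
        or return;
    return unless $bounce->is_bounce;

    my @details;
    for my $report ( $bounce->reports ) {
        my %row;
        for my $field (qw(email std_reason)) {
            my $value = $report->get($field);
            $value =~ s/\s+\z// if defined $value;    # strip trailing newline
            $row{$field} = $value;
        }
        push @details, \%row;
    }
    return @details;
}
```

A Mail::Audit filter could call `bounce_details` on the raw message and file the mail into the bounce folder whenever the list is non-empty.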
Re^3: Detecting bounced mails by Anonymous Monk on Sep 07, 2004 at 13:19 UTC
I am new to all of this but need similar help. I found Mail::DeliveryStatus::BounceParser but do not understand how I might have it parse my inbox to provide the email address and the reason the email was rejected.
Re^4: Detecting bounced mails by Corion (Patriarch) on Sep 07, 2004 at 13:28 UTC