or... How to Eat MIMEs

This is just a simple subroutine which will strip an email with a Content-Type of multipart/alternative of any parts which aren't text/plain. If you already have a script filtering your mail, just plug this in wherever you usually scan for stuff and pass it a reference to a string containing your mail message (headers and all). You might need to remove the byte-stuff and un-byte-stuff lines, depending... all I know is that I need them.
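For anyone wondering what the byte-stuff lines are for: POP3 (RFC 1939) sends any line that starts with a "." with an extra "." in front, so a proxy sitting in the middle sees the stuffed form. Here is a minimal sketch of the round trip, using the same two substitutions as the sub. The helper names byte_stuff and byte_unstuff are made up for illustration, and like the original regexes this does not handle a "." on the very first line of the message.

```perl
use strict;
use warnings;

# Hypothetical helper names, not part of the original sub.
# Add the extra "." that POP3 puts in front of lines starting with ".".
sub byte_stuff {
    my $s = shift;
    $s =~ s/\012\./\012../g;
    return $s;
}

# Remove that extra "." again so the message can be parsed normally.
sub byte_unstuff {
    my $s = shift;
    $s =~ s/\012\.\./\012./g;
    return $s;
}

my $plain   = "Subject: demo\015\012\015\012.hidden line\015\012regular line\015\012";
my $stuffed = byte_stuff($plain);
my $back    = byte_unstuff($stuffed);

print $back eq $plain ? "round trip ok\n" : "round trip failed\n";
```

If your filter gets its mail from somewhere other than a raw POP3 stream (a file on disk, say), the message is probably not stuffed, and those two lines can likely go.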

Personally I just plug it into McD's mail proxy, A SpamAssassin-Enabled POP3 Proxy, in place of the sub of the same name, which normally runs the spam filter. If you go that route, make sure to grab the pop3proxy.zip that he links to, if for nothing else than killproxy.pl, which is a quick and easy way to... well... kill the proxy.

Oh yes, you'll also need to use Email::Simple. (You could alternatively use a couple of the Mail:: modules; you'll just have to tweak the code a bit first.)

Happy Filtering.

use Email::Simple;

sub scan_mail {
    my $mailref = shift;
    $$mailref =~ s/\012\.\./\012\./g;    # un-byte-stuff

    my $mail = Email::Simple->new($$mailref);
    if ($mail->header("Content-type") =~ /Multipart\/Alternative/i) {
        # separate the parts
        my ($boundary) = $mail->header("Content-type") =~ /boundary=\"(.*?)\"/;
        my $body = $mail->body;
        # \Q...\E so regex metacharacters in the boundary are matched literally
        my @part = split /\-\-\Q$boundary\E\n/, $body;

        # find the part that's plain text
        foreach (@part) {
            next if (!/text\/plain/);
            $body = $_;
            last;
        }

        # Change the message's headers and remove them from the message
        # This has gotten a bit ugly again, but it works again
        while ($body =~ /(^Content.*?): (.*)\n/mig) {
            my ($header, $content, $extra_content) = ($1, $2);
            if ($content =~ /\;/) {
                # parameters may continue on the next line
                ($extra_content) = $body =~ /\G^(.*)$/m;
            }
            $extra_content = '' unless defined $extra_content;
            $mail->header_set($header, $content.$extra_content);
            $body =~ s/\Q$header\E: \Q$content\E\n\Q$extra_content\E//;
        }

        # finish up and put things back where they go
        $mail->body_set($body);
    }
    $$mailref = $mail->as_string;
    $$mailref =~ s/\012\./\012\.\./g;    # byte-stuff
}
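To see the split-and-pick step on its own, here is a small sketch that needs nothing outside core Perl. The sample Content-Type value, boundary string, and message body are invented for the demonstration, and it assumes plain "\n" line endings, which real mail will not always have.

```perl
use strict;
use warnings;

# Made-up sample data: a Content-Type header value and a two-part body.
my $content_type = 'multipart/alternative; boundary="XyZ.123+abc"';
my $body = join "\n",
    'This is a multi-part message in MIME format.',
    '--XyZ.123+abc',
    'Content-Type: text/plain; charset="us-ascii"',
    '',
    'Hello in plain text.',
    '--XyZ.123+abc',
    'Content-Type: text/html; charset="us-ascii"',
    '',
    '<p>Hello in HTML.</p>',
    '--XyZ.123+abc--',
    '';

# Pull the boundary out of the header value.
my ($boundary) = $content_type =~ /boundary="(.*?)"/;

# \Q...\E keeps '.' and '+' in the boundary from acting as regex operators.
my @parts = split /--\Q$boundary\E\n/, $body;

# Keep the first part that declares itself text/plain.
my ($plain) = grep { m{^Content-Type:\s*text/plain}mi } @parts;

print $plain;
```

Note that the closing "--XyZ.123+abc--" line does not match the split pattern, so it stays attached to the last part. That is harmless here because the plain text part comes first, but it is one of the reasons a real MIME parser is the safer choice.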

update:

Made the initial check for multipart messages case-insensitive, and made the grabbing of headers from the message a little more RFC 1521 friendly, though it may slow filtering down a bit. (I realize this could be done much more easily with MIME::Tools, as merlyn's article mentions, but this initially came about as a learning process, and I'd rather leave it basically the same except for minor fixes, which means no MIME::Tools.)

update2:

My upgrade of the code to grab the headers from the body and then erase them wasn't working as I expected, so I had to add a piece that catches the case of additional parameters occurring after a line break.
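The line-break case is header folding: a header line can continue onto the next line if that line starts with whitespace. A different way to deal with it than the one used in the sub (so take this as an alternative sketch, not a description of the code above) is to unfold the part's header block first and parse it afterwards. The sample part is invented and assumes "\n" line endings.

```perl
use strict;
use warnings;

# Made-up sample part whose Content-Type parameter sits on a continuation line.
my $part = join "\n",
    'Content-Type: text/plain;',
    "\tcharset=\"iso-8859-1\"",
    'Content-Transfer-Encoding: 7bit',
    '',
    'Body text here.',
    '';

# Split the part into its header block and its body at the first blank line.
my ($head, $text) = split /\n\n/, $part, 2;

# Unfold: a newline followed by whitespace continues the previous header line.
$head =~ s/\n[ \t]+/ /g;

# Now every header is on one line and can be parsed with a simple pattern.
my %header;
for my $line (split /\n/, $head) {
    my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
    $header{$name} = $value;
}

print "$_: $header{$_}\n" for sort keys %header;
```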

Just Another Perl Alchemist

•Re: Convert Multipart Email to plain text
by merlyn (Sage) on Mar 17, 2004 at 20:30 UTC

      I made use of the two MIME:: modules mentioned in the column, and they worked well... most of the time. For certain messages, they would cause my proxy to never pass the message along to my mail client. So until I can track down the problem, I am back to doing it with my hand-rolled solution. But nonetheless, your article gave me a good head start on the usage of those two modules, and the rest of the MIME:: family, thanks. (That, and it's nice to not be the only one to have made cheesy MIME puns.)

      Just Another Perl Alchemist
        Assuming you've followed along and managed to get a valid $entity (isa MIME::Entity), the following example from the manpage of the same name is pretty nifty:

        # Only keep text parts
        my @keep = grep { $_->effective_type =~ m|^text/| } $entity->parts;
        $entity->parts(\@keep);

        Peace,
        -McD
Re: Convert Multipart Email to plain text
by McD (Chaplain) on Mar 18, 2004 at 23:37 UTC
    Nice!

    I hadn't thought about it, but I suppose that proxy could be used for any number of nifty mail tricks, esp. on Win32.

    Peace,
    -McD