This is just a simple subroutine that strips any parts that aren't text/plain from an email with a Content-Type of multipart/alternative. If you already have a script filtering your mail, just plug this in wherever you usually scan messages and pass it a reference to a string containing your mail message (headers and all). You might need to remove the byte-stuff and un-byte-stuff lines, depending on your setup... all I know is that I need them.
Personally, I just plug it into McD's mail proxy, A SpamAssassin-Enabled POP3 Proxy, in place of the sub of the same name, which normally runs the spam filter. If you go that route, make sure to grab the pop3proxy.zip that he links to, if for nothing else than killproxy.pl, which is a quick and easy way to... well... kill the proxy.
Oh yes, you'll also need to use Email::Simple. (Alternatively, you could use a couple of the Mail:: modules; you'll just have to tweak the code a bit first.)
Happy Filtering.
    use Email::Simple;

    sub scan_mail {
        my $mailref = shift;

        $$mailref =~ s/\012\.\./\012\./g;    # un-byte-stuff

        my $mail = Email::Simple->new($$mailref);

        if ($mail->header("Content-type") =~ /Multipart\/Alternative/i) {
            # separate the parts
            my ($boundary) = $mail->header("Content-type") =~ /boundary=\"(.*?)\"/;
            my $body = $mail->body;
            # \Q...\E keeps regex metacharacters in the boundary literal
            my @part = split /\-\-\Q$boundary\E\n/, $body;

            # find the part that's plain text
            foreach (@part) {
                next if (!/text\/plain/);
                $body = $_;
                last;
            }

            # Change the message's headers and remove them from the message
            # This has gotten a bit ugly again, but it works again
            while ($body =~ /(^Content.*?): (.*)\n/mig) {
                my ($header, $content, $extra_content) = ($1, $2, '');
                if ($content =~ /\;/) {
                    # parameters continued on the next line
                    ($extra_content) = $body =~ /\G^(.*)$/m;
                }
                $mail->header_set($header, $content.$extra_content);
                $body =~ s/\Q$header\E: \Q$content\E\n\Q$extra_content\E//;
            }

            # finish up and put things back where they go
            $mail->body_set($body);
        }

        $$mailref = $mail->as_string;
        $$mailref =~ s/\012\./\012\.\./g;    # byte-stuff
    }
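For anyone wondering what the byte-stuff lines are for: POP3 (RFC 1939) sends a line that starts with a dot with an extra dot in front, so the receiver can tell it apart from the lone "." that ends the message. If your script sees the raw POP3 stream (as a proxy does), you probably want to undo that before parsing and redo it afterwards. Here is a minimal core-Perl sketch of the round trip. The helper names `byte_unstuff` and `byte_stuff` are made up for illustration, and unlike the sub above they also handle a dot on the very first line.

```perl
use strict;
use warnings;

# Hypothetical helpers; the sub above does the same thing inline.
sub byte_unstuff {
    my ($text) = @_;
    $text =~ s/^\.\././mg;    # a stuffed line always starts with two dots
    return $text;
}

sub byte_stuff {
    my ($text) = @_;
    $text =~ s/^\./../mg;     # prepend a dot to any line starting with one
    return $text;
}

# A message as it might look on the wire (sample data, not a real mail).
my $wire  = "Subject: test\012\012..hidden file\012plain line\012";
my $clean = byte_unstuff($wire);

print $clean;
print "round trip ok\n" if byte_stuff($clean) eq $wire;
```

If your mail comes from a file or from a module that already decoded the POP3 stream, these two steps would corrupt lines that legitimately start with dots, which is likely why you might need to drop them.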
update:
Made the initial check for multipart messages case insensitive, and made the grabbing of headers from the message a little more RFC 1521 friendly, though it may slow filtering down a bit. (I realize this could be done much more easily with MIME::Tools, as merlyn's article mentions, but this initially came about as a learning process, and I'd rather leave it basically the same, except for minor fixes, which means no MIME::Tools.)
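To illustrate the case-insensitivity point: the media type and the parameter name may show up in any case, and the boundary value may be quoted or bare. Below is a rough sketch of a more forgiving check, using only core Perl. The function name `alternative_boundary` and the sample strings are invented for the example, and this is not how the sub above does it (it only accepts a quoted boundary).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the boundary string, or undef if the header
# is not multipart/alternative or has no boundary parameter.
sub alternative_boundary {
    my ($content_type) = @_;
    return unless defined $content_type
        && $content_type =~ m{^\s*multipart/alternative\b}i;

    # Parameter name in any case; value either "quoted" or a bare token.
    if ($content_type =~ /;\s*boundary=(?:"([^"]*)"|([^\s;]+))/i) {
        return defined $1 ? $1 : $2;
    }
    return;
}

my $header   = 'Multipart/Alternative; Boundary=abc+123';
my $boundary = alternative_boundary($header);
print "boundary: $boundary\n";

# \Q...\E matters here: the "+" in this boundary is a regex metacharacter.
my $body  = "--abc+123\nplain part\n--abc+123\nhtml part\n--abc+123--\n";
my @parts = grep { length } split /^--\Q$boundary\E(?:--)?\r?\n/m, $body;
print scalar(@parts), " parts\n";
```

The split pattern also swallows the closing `--boundary--` line, so the last part does not end up carrying the terminator.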
update2:
My upgrade of the code to grab the headers from the body and then erase them wasn't working as I expected, so I had to add a piece that catches additional parameters occurring after a line break.
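The line-break case looks like a `Content-Type: text/plain;` line followed by an indented `charset=...` line. One alternative way to deal with it, sketched here in core Perl, is to unfold the continuation lines first and only then pick out the headers. This is an assumption-laden sketch, not the approach the sub above takes; the helper name `split_part` and the sample part are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one MIME part into unfolded headers and a body.
sub split_part {
    my ($part) = @_;

    # Headers end at the first blank line.
    my ($head, $body) = split /\r?\n\r?\n/, $part, 2;
    $head = '' unless defined $head;
    $body = '' unless defined $body;

    # A line starting with whitespace continues the previous header.
    $head =~ s/\r?\n[ \t]+/ /g;

    my %header;
    for my $line (split /\r?\n/, $head) {
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        $header{ lc $name } = $value;    # header names are case-insensitive
    }
    return (\%header, $body);
}

my $part = "Content-Type: text/plain;\n\tcharset=\"us-ascii\"\n"
         . "Content-Transfer-Encoding: 7bit\n\nHello, world.\n";

my ($header, $body) = split_part($part);
print "$_: $header->{$_}\n" for sort keys %$header;
print $body;
```

With the headers in a hash, copying the Content-* ones onto the outer message becomes a plain loop, and the body no longer needs to be edited in place while it is being scanned.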
Re: Convert Multipart Email to plain text
by merlyn (Sage) on Mar 17, 2004 at 20:30 UTC
by Koosemose (Pilgrim) on Mar 19, 2004 at 18:03 UTC
by McD (Chaplain) on Mar 19, 2004 at 19:27 UTC

Re: Convert Multipart Email to plain text
by McD (Chaplain) on Mar 18, 2004 at 23:37 UTC