in reply to Re^2: MIME voodoo.
in thread MIME voodoo.

Your input isn't a correct MIME message. There should be a header that defines the boundary, and it's missing. Here's a small, quick-and-dirty (sorry...) example:

use strict;
use warnings;
use 5.010;
use MIME::Lite;
use Email::MIME;

my $msg = MIME::Lite->new(
    From    => 'src@example.com',
    To      => 'dst@example.com',
    Subject => 'message',
    Type    => 'multipart/alternative',
);
$msg->attach(
    Type => 'text/plain',
    Data => 'this is text content',
);
$msg->attach(
    Type => 'text/html',
    Data => 'this is <b>html</b> content',
);

my $msg_str = $msg->as_string;
print $msg_str;    # note the output here -- that's a complete message

my $parsed = Email::MIME->new($msg_str);
say "*" x 50;
if ($parsed->content_type =~ m{^multipart/alternative}) {
    say get_text_parts($parsed)->body;
}

sub get_text_parts {
    my @parts = shift->parts;
    my %ct;
    $ct{$_->content_type} = $_ for @parts;
    return $ct{'text/plain'} if exists $ct{'text/plain'};
    return $ct{'text/html'}  if exists $ct{'text/html'};
    return $parts[0];
}

Upd: minor fix

Upd: Note also that $_->content_type may return something like text/plain; charset=utf-8, and this code will fail in this case.

Re^4: MIME voodoo.
by vxp (Pilgrim) on Jul 16, 2009 at 19:37 UTC

    This works, and doesn't work - at the same time.

    I think an explanation is due after a statement like that, so here goes:

    Take this code:

    #!/usr/bin/perl
    use Email::MIME;

    $file  = shift;
    $which = shift;
    ##############################
    # $which is:
    # text = plain text portion
    # html = html portion
    ##############################

    local( $/, *FILE );
    open(FILE, $file);
    $message = <FILE>;
    close(FILE);

    my $parsed = Email::MIME->new($message);
    if ($parsed->content_type =~ m{^multipart/alternative}) {
        print get_text_parts($parsed)->body;
    }

    sub get_text_parts {
        my @parts = shift->parts;
        my %ct;
        $ct{$_->content_type} = $_ for @parts;
        return $ct{'text/plain'} if exists $ct{'text/plain'};
        return $ct{'text/html'}  if exists $ct{'text/html'};
        return $parts[0] if $which =~ /text/;
        return $parts[1] if $which =~ /html/;
    }

    And also take this input:

    Content-Type: multipart/alternative; boundary="_000_200907060005UAA14932pisas291mscom88clm_"
    MIME-Version: 1.0

    --_000_200907060005UAA14932pisas291mscom88clm_
    Content-Type: text/plain; charset="iso-8859-1"
    Content-Transfer-Encoding: quoted-printable

    line1
    line2
    line3

    --_000_200907060005UAA14932pisas291mscom88clm_
    Content-Type: text/html; charset="iso-8859-1"
    Content-Transfer-Encoding: quoted-printable

    <p>blah</p>
    <p>blah2</p>

    --_000_200907060005UAA14932pisas291mscom88clm_--

    --_004_38D25DCAD7370B4FACA079E2FAA2C690B02CB5NYWEXMB24msadmsco_--

    When you run it, the results are as follows (this is the "WORKS" part):

    $ ./grab.pl mime4 text
    line1
    line2
    line3
    $ ./grab.pl mime4 html
    <p>blah</p>
    <p>blah2</p>
    $

    Now, what does NOT work: as I said in my original request, sometimes the plain text MIME part is on top and sometimes it's on the bottom. Using this code, if you switch the two mimes around it doesn't work. Try switching them around and telling the code to give you the HTML portion: it'll give you the plain text portion instead.

    Is there any way, to your knowledge, to specifically request the HTML or the text portion, so that it doesn't matter what order the MIME parts are in the input?

      if you switch the two mimes around it doesn't work

      Note my last comment for the example. There's a bug, as $_->content_type returns not just text/plain, but text/plain; charset=.... So you should replace

      $ct{$_->content_type} = $_ for @parts;
      with
      for (@parts) { (my $c = $_->content_type) =~ s/;.+//; $ct{$c} = $_; }

      but this solution, though it would work for most messages, isn't perfect either, as it is possible to have several text/plain parts with different charsets.
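To illustrate the caveat about several text/plain parts: a minimal sketch, assuming it is acceptable to keep every part per media type in a hash of arrays instead of letting later parts overwrite earlier ones. The helper name parts_by_type and the sample message are hypothetical, not from the thread; only Email::MIME's new, parts and content_type are used, and a missing Content-Type header is assumed to mean text/plain.

```perl
use strict;
use warnings;
use 5.010;
use Email::MIME;

# Hypothetical helper: group the direct parts of a parsed message by their
# bare media type, keeping ALL parts of a type (not just the last one seen).
sub parts_by_type {
    my ($email) = @_;
    my %by_type;
    for my $part ($email->parts) {
        # content_type may be undef when the header is absent;
        # treating that as text/plain is an assumption of this sketch.
        my $header = $part->content_type // 'text/plain';
        # Strip parameters such as '; charset="utf-8"' and normalise case.
        (my $type = lc $header) =~ s/\s*;.*//s;
        push @{ $by_type{$type} }, $part;
    }
    return \%by_type;
}

# Hypothetical sample: two text/plain parts with different charsets.
my $raw = <<'END';
Content-Type: multipart/alternative; boundary="b1"
MIME-Version: 1.0

--b1
Content-Type: text/plain; charset="iso-8859-1"

latin-1 text
--b1
Content-Type: text/plain; charset="utf-8"

utf-8 text
--b1
Content-Type: text/html; charset="utf-8"

<p>html</p>
--b1--
END

my $grouped = parts_by_type(Email::MIME->new($raw));
for my $type (sort keys %$grouped) {
    printf "%s: %d part(s)\n", $type, scalar @{ $grouped->{$type} };
}
```

The caller can then pick among $grouped->{'text/plain'} by charset or position, whichever suits the messages at hand.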

        Ah! I see now - you're getting rid of the semicolon and everything that follows, so you're only left with text/plain or text/html. I see. :)

        Any ideas why it only returns the plain text portion, and no html?

        sub get_text_parts {
            my @parts = shift->parts;
            my %ct;
            # $ct{$_->content_type} = $_ for @parts;
            for (@parts) {
                (my $c = $_->content_type) =~ s/;.+//;
                # print "\nDEBUG: $c\n";
                $ct{$c} = $_;
            }
            return $ct{'text/plain'} if exists $ct{'text/plain'};
            return $ct{'text/html'}  if exists $ct{'text/html'};
            return $parts[0] if $which =~ /text/;
            return $parts[1] if $which =~ /html/;
        }

        outputs:

        $ ./grab.pl mime4 html
        line1
        line2
        line3
        $ ./grab.pl mime4 text
        line1
        line2
        line3
        $

        I really appreciate you taking the time to help out, thanks a lot :)
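
Reading the sub above: once the keys are normalised, exists $ct{'text/plain'} is true for this input, so the first return fires before $which is ever consulted. The earlier version only appeared to honour $which because no hash key matched and control fell through to the positional $parts[0] / $parts[1] returns. A minimal sketch of selecting by the requested type instead, assuming the wanted parts are direct children of the multipart/alternative message; the helper name get_part_of_type, the %type_for mapping and the sample message are hypothetical:

```perl
use strict;
use warnings;
use 5.010;
use Email::MIME;

# Hypothetical helper: return the first direct part whose bare media type
# equals $wanted (e.g. 'text/plain' or 'text/html'), regardless of order.
sub get_part_of_type {
    my ($email, $wanted) = @_;
    for my $part ($email->parts) {
        # A missing Content-Type is treated as text/plain (assumption).
        my $header = $part->content_type // 'text/plain';
        # Drop '; charset=...' and any other parameters, normalise case.
        (my $type = lc $header) =~ s/\s*;.*//s;
        return $part if $type eq $wanted;
    }
    return;    # no part of that type
}

# Map the command-line style keywords from the thread to media types.
my %type_for = (text => 'text/plain', html => 'text/html');

# Hypothetical sample with the HTML part FIRST, to show order is irrelevant.
my $raw = <<'END';
Content-Type: multipart/alternative; boundary="b1"
MIME-Version: 1.0

--b1
Content-Type: text/html; charset="iso-8859-1"

<p>blah</p>
--b1
Content-Type: text/plain; charset="iso-8859-1"

line1
--b1--
END

my $parsed = Email::MIME->new($raw);
for my $which (qw(text html)) {
    my $part = get_part_of_type($parsed, $type_for{$which});
    say "$which: ", defined $part ? $part->body : '(no such part)';
}
```

In grab.pl this would replace the get_text_parts call with something like get_part_of_type($parsed, $type_for{$which}); messages that nest the alternatives inside further multipart containers would need a recursive walk, which this sketch does not attempt.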