cez has asked for the wisdom of the Perl Monks concerning the following question:

Is there a module/script out there that deals with MIME laden email and "strips" out the textual message for display? (and is smart enough to prefer the plaintext, or strips html tags, etc) thanks

Re: email and MIME?
by grinder (Bishop) on Sep 25, 2001 at 12:01 UTC

    Yes, you just need to install the MIME-tools distribution from CPAN. We use Lotus Notes at work, and from time to time it chokes on a MIME-encoded message and is unable to pull it apart. In that case I save the message out as a text file and then run the following code on it, which dumps the different sections into separate files.

    #! /usr/bin/perl -w
    use strict;
    use MIME::Parser;

    my $parser = new MIME::Parser;
    my $entity = $parser->parse(\*STDIN) or die "parse failed\n";
    $entity->dump_skeleton;

    That code is straight out of the documentation. For cleaning up the HTML, you might want to consider crafting something manually with HTML::Parser (a good learning exercise).

    At first I planned on doing exactly what you suggest, but this cheap hack was "good enough," so I let that idea slide.
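For the HTML clean-up mentioned above, here is a minimal sketch of a tag stripper built on HTML::Parser. The helper name strip_html is made up for this example, and the whitespace handling is an assumption about what "displayable" means. It does not try to preserve layout such as paragraphs or lists.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# strip_html is a hypothetical helper, not part of HTML::Parser itself.
sub strip_html {
    my ($html) = @_;
    my $text = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        # 'dtext' passes the handler text with entities already decoded
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );
    $p->ignore_elements(qw(script style));   # skip these elements and their contents
    $p->parse($html);
    $p->eof;                                 # flush anything still buffered

    $text =~ s/^\s+|\s+$//g;                 # trim both ends
    $text =~ s/\s+/ /g;                      # collapse runs of whitespace
    return $text;
}

print strip_html('<p>Hello <b>world</b> &amp; friends</p>'), "\n";
```

Feeding it the decoded body of a text/html part should give something readable enough for a quick display.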

    --
    g r i n d e r
Re: email and MIME?
by projekt21 (Friar) on Sep 25, 2001 at 12:34 UTC

    In addition to grinder's reply, I would advise you to take a closer look at MIME::Entity.

    Using its mime_type or effective_type method on each MIME part, you can find the "text/plain" part and process it.

    You can call these methods on the MIME::Entity objects that MIME::Parser returns.

    ############################################################
    sub split_entity {
    ############################################################
        my $entity = shift;               # needs a MIME::Entity object
        my $num_parts = $entity->parts;   # how many mime parts?

        if ($num_parts) {                 # we have a multipart mime message
            foreach (1..$num_parts) {
                split_entity( $entity->parts($_ - 1) );
            }
        } else {                          # we have a single mime message/part
            # text, but neither html nor enriched
            if ($entity->effective_type =~ /^text\/(?!(html|enriched))/) {
                # do something
            } else {
                # do something different
            }
        }
    }
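To tie the sub above back to the original question, the following is a rough sketch of "prefer the plain text, fall back to HTML". The helper names first_part_of_type and displayable_text and the sample message are invented for illustration. The MIME::Parser and MIME::Entity calls (output_to_core, parse_data, parts, effective_type, bodyhandle) come from MIME-tools.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: depth-first search for the first leaf part
# whose effective type equals $type.
sub first_part_of_type {
    my ($entity, $type) = @_;
    my @parts = $entity->parts;
    if (@parts) {
        for my $part (@parts) {
            my $hit = first_part_of_type($part, $type);
            return $hit if defined $hit;
        }
        return undef;
    }
    return $entity->effective_type eq $type ? $entity : undef;
}

# Hypothetical helper: return (type, decoded body), preferring text/plain.
sub displayable_text {
    my ($entity) = @_;
    for my $type ('text/plain', 'text/html') {
        my $part = first_part_of_type($entity, $type);
        next unless defined $part && $part->bodyhandle;
        return ($type, $part->bodyhandle->as_string);
    }
    return;
}

# Made-up sample message with the HTML alternative listed first.
my $message = join "\n",
    'MIME-Version: 1.0',
    'Subject: demo',
    'Content-Type: multipart/alternative; boundary="XX"',
    '',
    '--XX',
    'Content-Type: text/html',
    '',
    '<p>Hello <b>HTML</b></p>',
    '--XX',
    'Content-Type: text/plain',
    '',
    'Hello plain text',
    '--XX--',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);    # keep decoded bodies in memory instead of files
my $entity = $parser->parse_data($message);

my ($type, $text) = displayable_text($entity);
print "$type\n$text\n";
```

If only a text/html part is found, the returned body still contains markup, so it would need a tag-stripping pass before display.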

    alex pleiner <alex@zeitform.de>
    zeitform Internet Dienste

      thank you for the replies

      I guess that's a 'no' on the existing scripts. I'm a little surprised there isn't already some Perl module to deal with these things a little more intelligently.

      the problem is, email includes some fun surprises like nested messages (with full headers) that most existing email readers are smart enough to understand, and process/display correctly. This isn't really a big deal, since we're not dealing with something very complicated here.. but this is a personal project, so I was hoping there was something already out there..

      thanks again

        Don't be that pessimistic. MIME-tools and the above sub do a great job for me separating text and non-text MIME components. As long as the mailer that sent the mail follows the relevant RFCs, it is easy to handle.

        If you don't believe me, have a look at this webmailer (in German, but guessable), which uses the above sub.

        For the fun surprises, notice the recursive call of that sub.
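As a sketch of why the recursion copes with nested messages: with extract_nested_messages enabled (the documented default), MIME::Parser stores an embedded message as the sole part of its message/rfc822 entity, so walking parts reaches it like any other part. The outline helper and the sample message below are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: one line per entity, indented by nesting depth,
# with the Subject header appended where the entity has one.
sub outline {
    my ($entity, $depth) = @_;
    $depth ||= 0;

    my $line = ('  ' x $depth) . $entity->effective_type;
    my $subject = $entity->head->get('Subject', 0);
    if (defined $subject) {
        chomp $subject;
        $line .= " (Subject: $subject)";
    }

    my @lines = ($line);
    push @lines, outline($_, $depth + 1) for $entity->parts;
    return @lines;
}

# Made-up sample: a short note plus a forwarded mail with its own headers.
my $message = join "\n",
    'MIME-Version: 1.0',
    'Subject: outer',
    'Content-Type: multipart/mixed; boundary="OUT"',
    '',
    '--OUT',
    'Content-Type: text/plain',
    '',
    'See the forwarded mail.',
    '--OUT',
    'Content-Type: message/rfc822',
    '',
    'Subject: inner',
    'Content-Type: text/plain',
    '',
    'Inner body',
    '--OUT--',
    '';

my $parser = MIME::Parser->new;
$parser->output_to_core(1);            # no files on disk for this demo
$parser->extract_nested_messages(1);   # spelled out here; it is the default
my $entity = $parser->parse_data($message);

my @lines = outline($entity);
print "$_\n" for @lines;
```

The forwarded mail shows up one level below the message/rfc822 entry with its own Subject, which is where a recursive sub picks it up.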

        alex pleiner <alex@zeitform.de>
        zeitform Internet Dienste