I'm at the "receiving end" of the mail wire: I receive mails in my (MS Windows Exchange) inbox, encoded in standard mail/MIME format.

I'm interested in the text part of the body of these mails, that is, "what follows the mail header" (i.e. the From:, Sent:, To:, Issue: stuff). The body contains haiku entries, which I parse and reshuffle into a voting list and subsequently rank according to the votes received, but the app as such is not that interesting in this context.
An example of the text part of a mail message is:

[author] xxx yyy
[1] clouds . . . the distance blossoming between two crows
[2] a morning without incident dead fly
[3] sunrise ceremony the holy man's third eye bloodshot
[4] morning dew bell bottoms darkened by mayflies

This is what I'm interested in parsing out: it is the text part of the message, as displayed in the mail client (in this case, MS Outlook).
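For illustration, a minimal sketch of how a text like the example could be split into an author and numbered entries. It assumes only the `[author]` and `[n]` markers shown in the example; `parse_entries` is a hypothetical name, not code from the actual app.

```perl
use strict;
use warnings;

# Hypothetical helper: split one entry text into the author name and
# a hash of haiku keyed by their bracketed number.
sub parse_entries {
    my ($text) = @_;

    # Each field runs up to the next "[n]" marker or the end of the text.
    my $until_next = qr/\s*(?=\[\d+\]|\z)/;

    my ($author) = $text =~ /\[author\]\s*(.*?)$until_next/s;

    my %haiku;
    while ($text =~ /\[(\d+)\]\s*(.*?)$until_next/gs) {
        $haiku{$1} = $2;
    }
    return ($author, \%haiku);
}

my ($author, $haiku) = parse_entries(
    "[author] xxx yyy [1] a morning without incident dead fly [2] sunrise ceremony"
);
print "$author: $_ => $haiku->{$_}\n" for sort { $a <=> $b } keys %$haiku;
```

Because the pattern treats any whitespace as a separator, it should work whether the entries arrive on one line or one per line.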

The problem is that the above text is not what I get from the mail body handed over by the MIME modules mentioned earlier. Instead I get the full mail body segment, including binary MIME encodings and HTML tagging.

So I have to do some filtering to get at the text "payload" that I need for the app. I was wondering whether anybody had already wrapped this functionality into a function, possibly in a MIME module. That was my question.
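To make the question concrete, here is a minimal sketch of the kind of helper meant. It assumes the CPAN module Email::MIME, which is an assumption for illustration and not necessarily one of the modules discussed in this thread; `plain_text_of` is a hypothetical name. The idea is to walk every part of the message and keep only the decoded text/plain leaves, skipping HTML alternatives and binary parts.

```perl
use strict;
use warnings;
use Email::MIME;    # CPAN module; an assumption, not necessarily the one used in the thread

# Hypothetical helper: return the decoded text/plain content of a raw message.
sub plain_text_of {
    my ($raw) = @_;
    my @text;
    Email::MIME->new($raw)->walk_parts(sub {
        my ($part) = @_;
        my @children = $part->subparts;
        return if @children;                              # multipart container, not a leaf
        my $type = $part->content_type // 'text/plain';   # default when the header is missing
        return unless $type =~ m{^text/plain\b}i;         # skip HTML, images, attachments
        push @text, $part->body_str;                      # undoes transfer encoding and charset
    });
    return join "\n", @text;
}

# A tiny two-part message: plain text plus an HTML alternative.
my $raw = join "\n",
    'From: a@example.com',
    'Subject: haiku',
    'MIME-Version: 1.0',
    'Content-Type: multipart/alternative; boundary="XX"',
    '',
    '--XX',
    'Content-Type: text/plain; charset="us-ascii"',
    '',
    '[1] a morning without incident dead fly',
    '--XX',
    'Content-Type: text/html; charset="us-ascii"',
    '',
    '<p>[1] a morning without incident dead fly</p>',
    '--XX--',
    '';

print plain_text_of($raw), "\n";
```

Note that this sketch would also pick up plain-text attachments; filtering on the Content-Disposition header would be a further step if that matters.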

I haven't worked with email before, so maybe I'm simply overlooking some basic assumptions about the MIME format & parsing...
-- allan

In reply to Re^2: Extracting TEXT from email by ady
in thread Extracting TEXT from email by ady
