BernieC has asked for the wisdom of the Perl Monks concerning the following question:

I'm lost in a bunch of twisty passages, all in the Email:: world :o). My problem is {I think} very simple, but the Email modules seem much more focused on creating/modifying messages, and I can't see how to just *examine* a message.

What I want to do is write a mock email reader. This means I need to parse out the headers {I just need things like from/to/subject/date} and then find the "body" of the message. There seem to be three types of incoming email: one is plain text; another is plain HTML {that is, not multipart but just HTML; I got one just today}:

X-CMAE-Envelope: MS4xfLUIIc3gwFFCUTu1+RYnII5snX2pyaUrABakvIQ567LlL7RBFLy4Wo65N93eCIInGj50aDn6TLwhXwJbk7HKUHu2pUzH8OWeKTJoF2xE/w3tkTQrR8cj
 Kh4gBf/TMflzvBVgeRGN7++n/ZIwr/endxydKhxB1KRKrAoSBcA1O3+KsH4dy7QKym+yU9SP+8B9fQ==
X-PMFLAGS: 34095744 0 65537 PQVHWQ2O.CNM
X-CC-Diagnostic: Body contains "click here" (20)

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.=
w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns:o=3D"urn:schemas-microsoft-com:office:office" xmlns:v=3D"urn:sc=
hemas-microsoft-com:vml">=20
<head>
<!--[if gte mso 9]><xml><o:OfficeDocumentSettings><o:AllowPNG/><o:P=
ixelsPerInch>96</o:PixelsPerInch></o:OfficeDocumentSettings></xml><![endif]=
-->=20

And, of course, there is the multipart message {in which case I'd want the HTML part}.

This feels like it should be easy, but so much of Email::* is occupied with modifying/adding/MIMEing, etc. that I can't separate out the simple "parse and extract" machinery I need. Any advice/guidance/tutorial? THANKS

Re: Parsing an email
by kcott (Archbishop) on Jan 09, 2022 at 08:25 UTC

    G'day BernieC,

    Take a look at Courriel. I haven't used it myself; however, its documentation indicates it has straightforward methods that return all the things you want (from(), to(), subject(), plain_body_part(), html_body_part(), and so on).

    It's been around for over a decade with many updates (see Changes) over the years, the last being just a few months ago. Its author, Dave Rolsky, is a well-known and respected CPAN contributor.
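A minimal, untested sketch of how those documented methods might fit together for the "mock email reader" case. The helper name summarize_email and the choice to prefer the HTML part over the plain-text part are assumptions for illustration, not something Courriel prescribes; the method names (parse, from, to, subject, datetime, html_body_part, plain_body_part, content) are taken from Courriel's documentation.

```perl
use strict;
use warnings;
use Courriel;

# Hypothetical helper: parse one raw message and return the few
# fields a simple reader needs, as a hash reference.
sub summarize_email {
    my ($raw) = @_;

    my $email = Courriel->parse( text => $raw );

    my $from = $email->from;        # address object, or undef if no From header
    my @to   = $email->to;          # list of address objects
    my $date = $email->datetime;    # DateTime object

    # Assumption: prefer the HTML part, fall back to plain text.
    my $part = $email->html_body_part || $email->plain_body_part;

    return {
        subject   => $email->subject,
        from      => $from ? $from->address : undef,
        to        => [ map { $_->address } @to ],
        date      => $date->iso8601,
        body_type => $part ? $part->mime_type : undef,
        body      => $part ? $part->content   : undef,   # decoded content
    };
}

# Usage: perl reader.pl message.eml
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $raw = do { local $/; <$fh> };
    my $summary = summarize_email($raw);
    print "$_: ", $summary->{$_} // '(none)', "\n" for qw(from subject date body_type);
}
```

If this works as the documentation suggests, the same call should cover all three cases in the question, since the body-part methods search the whole message whether or not it is multipart.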

    — Ken

      Mail::Box looks like it'll work but Courriel looks like exactly what I wanted: "This class exists to provide a high level API for working with emails, particular for processing incoming email." Thanks!!
        Dumb question -- I did a "cpan i Courriel" and it seems to have installed without its documentation:
        D:\>perldoc Courriel
        No documentation found for "Courriel".
        I've downloaded the tar.gz for it and pawed through it, and I can't see anything that looks like a .pod or .1 file in it, so I dunno what to do next.
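One thing that may be worth checking first is whether the module is actually loadable, since "No documentation found" can also mean the module itself is not on @INC. The sketch below is only a diagnostic under that assumption; module_status is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a module can be loaded and,
# if so, which version was found and where.
sub module_status {
    my ($module) = @_;

    # Turn Some::Module into Some/Module.pm, the form require() expects.
    ( my $file = "$module.pm" ) =~ s{::}{/}g;

    if ( eval { require $file; 1 } ) {
        my $version = $module->VERSION // 'unknown';
        return "$module $version is installed at $INC{$file}";
    }
    return "$module could not be loaded: $@";
}

# Usage: perl check.pl Courriel
print module_status( $ARGV[0] // 'Courriel' ), "\n";
```

If the module does load, the printed path points at the .pm file, which is also where inline POD would live if the distribution ships its documentation that way.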
Re: Parsing an email
by Fletch (Bishop) on Jan 09, 2022 at 04:13 UTC

    I've used Mail::Box to work with a maildir folder or three (pulled down with offlineimap), moving things around based on headers. It might do what you need.
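For a single message rather than a whole folder, something along these lines might work with Mail::Message, which is part of the Mail::Box distribution. This is an untested sketch: html_or_text_body is a hypothetical helper name, and preferring text/html over text/plain is an assumption taken from the question.

```perl
use strict;
use warnings;
use Mail::Message;

# Hypothetical helper: return (subject, body text) for one raw message,
# preferring an HTML part and falling back to plain text.
sub html_or_text_body {
    my ($raw) = @_;

    my $msg = Mail::Message->read( \$raw )
        or die "Could not parse message\n";

    my ( $plain, $html );

    # 'RECURSE' walks nested multiparts; a non-multipart message
    # is expected to yield just itself.
    for my $part ( $msg->parts('RECURSE') ) {
        my $type = lc $part->contentType;
        $html  //= $part if $type eq 'text/html';
        $plain //= $part if $type eq 'text/plain';
    }

    my $best = $html // $plain or return;

    # decoded() undoes the transfer encoding (e.g. quoted-printable).
    return ( $msg->subject, $best->decoded->string );
}

# Usage: perl body.pl message.eml
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $raw = do { local $/; <$fh> };
    my ( $subject, $body ) = html_or_text_body($raw);
    print "Subject: $subject\n\n", $body // '(no text body found)', "\n";
}
```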

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.