peteredhair has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I'm posting in search of help: I'm a newbie to Perl and have recently returned to development to solve some management problems.
I set up Nagios to check the services at my data processing site. Nagios works great and does its simple job very efficiently: when there's a problem, it raises the flag.

However, I have a lot of applications and app web services for which my only visibility is an email they send reporting their status or the result of a job. For instance, an app reports that it found 200 files to process, or received 200 web service requests, and that 15 of them failed.
They report this in an HTML email containing a table of the files or requests processed and how many had errors.

I need to read these emails, find which ones correspond to problems, and then summarise this as a report or inject it into Nagios.
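For the "inject it into Nagios" step, one common route is a passive service check, submitted as a PROCESS_SERVICE_CHECK_RESULT external command. The sketch below only builds the command line; it is not from the original post. The helper name `passive_check_line`, the host and service names, and the rule "any error means CRITICAL" are all assumptions for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build one Nagios external-command line for a passive service check.
# Assumptions: $host and $service are placeholders that must match names
# defined in the Nagios configuration, and any error count above zero is
# treated as CRITICAL (return code 2); otherwise the result is OK (0).
sub passive_check_line {
    my ($time, $host, $service, $total, $errors) = @_;
    my ($code, $label) = $errors > 0 ? (2, 'CRITICAL') : (0, 'OK');
    return sprintf "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s - %d of %d items failed\n",
        $time, $host, $service, $code, $label, $errors, $total;
}

# Example: 15 failures out of 200 items, stamped with the current time.
print passive_check_line(time(), 'apphost01', 'File import job', 200, 15);
```

The resulting line would then be appended to the Nagios external command file, whose location depends on the installation.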

I need help processing the email headers and body, as some of them arrive encoded in UTF-8 or other encodings (and there is also a lot of HTML in the body). A specific example, for the Subject or From header field:

Subject: =?UTF8?B?5LuO5Y2a5a6i5paH56ug5Lit5p+l5om+5oKo5oSf5YW06Laj55qE5Li7?==?UTF-8?B?6aKY?= =?UTF8?B?it5p+l5om+5oKo5oSf5YW06Laj55?=

The header comes back split across separate lines.

I'm learning Perl by example and couldn't find a simple but fast solution to this.

However, I found Python sample code that does all of this at https://github.com/akkana/scripts/blob/master/decodemail.py. Is there something similar already available for Perl? It would solve my first big issue and let me move on to the other parts of my app, learning along the way.

Sorry for the long post and thanks in advance.

Peteredhair

Re: eMail processing for alarmistic
by hippo (Archbishop) on Apr 28, 2018 at 12:21 UTC
    I'm learning Perl by example and couldn't find a simple but fast solution to this.

    Email::MIME will handle the decoding for you either for one header or for all of them. Here's a quick demo of both.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    use Email::MIME;
    use Encode;

    my $raw = <<'EOT';
    Subject: =?UTF8?B?5LuO5Y2a5a6i5paH56ug5Lit5p+l5om+5oKo5oSf5YW06Laj55qE5Li7?==?UTF-8?B?6aKY?= =?UTF8?B?it5p+l5om+5oKo5oSf5YW06Laj55?=
    To: larry@perl.org
    From: peteredhair@perlmonks.org

    Hi Larry!
    EOT

    my $email = Email::MIME->new ($raw);

    # Just one
    my $subj = $email->header ('Subject');
    print "Single header subject is '", encode ("utf-8", $subj), "'\n\n";

    # Or all of them
    my @headers = $email->header_str_pairs ();
    while (@headers) {
        my $key = shift @headers;
        my $val = shift @headers;
        print "$key: ", encode ("utf-8", $val), "\n";
    }

    PS. This is essentially hinted at in the FAQ How do I parse a mail header?
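
If installing Email::MIME is not an option, the core Encode module can decode a single header value through its MIME-Header encoding. This is a minimal sketch, assuming the raw header value is already in hand. `decode_header_value` is a made-up helper name, and the sample value is the short `6aKY` chunk from the Subject header in the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode encode);

# Turn a raw header value containing RFC 2047 encoded-words
# (=?charset?B?...?= or =?charset?Q?...?=) into a Perl character string.
# Hypothetical helper: it only wraps Encode's built-in MIME-Header encoding.
sub decode_header_value {
    my ($raw) = @_;
    return decode('MIME-Header', $raw);
}

my $subject = decode_header_value('=?UTF-8?B?6aKY?=');

# Encode back to UTF-8 bytes before printing to avoid "Wide character" warnings.
print encode('UTF-8', $subject), "\n";
```

This covers only header values. Splitting the message into headers, parts and bodies is still easier to leave to Email::MIME.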

      Thank you very much for your reply and help.
      That did it, and I've already integrated it into my code.

      I wouldn't have got there without your help, as there are too many modules and ways to parse and decode mail.
      Now I can move on to the rest of the process.

      When I finish this program I'll post it, in case someone needs it or it helps others like me who are just starting.

      Thanks
      Peteredhair

      Since your reply I have extended my code a little, found a few problems, searched a bit more and put together a solution based partly on other people's code.

      In case it helps other newcomers, I'm posting the code I've copied, pasted and built so far.

      This program accepts an Outlook .msg file as an argument and prints the readable text parts of the email that I need.

      #!/usr/bin/perl -w
      use strict;
      use warnings;

      use Email::MIME;
      use Email::Address::XS;
      use Encode;
      use Email::Outlook::Message;

      for my $filename ( glob("$ARGV[0]*") ) {

          # will create an Email::Mime object from .msg file
          my $msg = new Email::Outlook::Message $filename
              or die "Can't open $filename: $!\n";
          my $msg_mime = $msg->to_email_mime;
          #print "Mime parts: " . $msg_mime->as_string, "\n";

          my ($from) = Email::Address::XS->parse($msg_mime->header ('From'));
          my ($to) = Email::Address::XS->parse($msg_mime->header ('To'));
          my $subject = encode ("utf8", $msg_mime->header ('Subject'));
          my $date = $msg_mime->header ('Date');

          print "FROM: ", $from, "\n";
          print "TO: ", $to, "\n";
          print "SUBJECT: ", $subject, "\n";
          print "DATE: ", $date, "\n\n";

          ## tell the filename reading
          # print 'Filename: ', $filename, "\n";

          my (@mailData, $body);
          $msg_mime->walk_parts(sub {
              my ($part) = @_;
              # warn($part->content_type . ": " . $part->subparts);
              if (($part->content_type =~ /text\/plain; charset=\"?utf-8\"?/i) && !@mailData) {
                  @mailData = split( '\n', $part->body);
              }
              elsif (($part->content_type =~ /text\/plain; charset=\"?us-ascii\"?/i) && !@mailData) {
                  @mailData = split( '\n', $part->body);
              }
              elsif (($part->content_type =~ /text\/plain; charset=\"?windows-1252\"?/i) && !@mailData) {
                  @mailData = split( '\n', $part->body);
              }
              elsif (($part->content_type =~ /text\/plain; charset=\"?iso-8859-1\"?/i) && !@mailData) {
                  @mailData = split( '\n', $part->body);
              }
          });
          #print $part->body;

          #only need utf-8 for this test
          foreach my $line (@mailData) {
              print encode ("utf8", $line) . "\n";
          }
      }
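
The four branches above differ only in the charset name, and `$part->body` hands back undecoded bytes, so non-ASCII text in a windows-1252 or iso-8859-1 part would likely come out garbled once it is re-encoded as UTF-8. One possible alternative, sketched here with core modules only, is to read the charset from the Content-Type value and decode with it. `charset_of` and `decode_text_body` are hypothetical helper names, and falling back to us-ascii when no charset is declared is an assumption. Email::MIME also documents a `body_str` method that decodes a part according to its declared charset.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);

# Hypothetical helper: extract the charset parameter from a Content-Type
# value such as 'text/plain; charset="utf-8"'. Returns undef if absent.
sub charset_of {
    my ($content_type) = @_;
    return unless defined $content_type;
    return lc $1 if $content_type =~ /;\s*charset\s*=\s*"?([^";\s]+)"?/i;
    return;
}

# Hypothetical helper: decode the raw bytes of a text/plain part into a
# Perl character string. Returns undef for other content types or for
# charsets that Encode does not know.
sub decode_text_body {
    my ($content_type, $bytes) = @_;
    return unless defined $content_type
        && $content_type =~ m{^\s*text/plain\b}i;
    my $charset = charset_of($content_type) // 'us-ascii';   # assumed default
    my $enc = find_encoding($charset) or return;
    return decode($enc->name, $bytes);
}

# Example: Latin-1 bytes in, UTF-8 bytes out.
my $text = decode_text_body('text/plain; charset="iso-8859-1"', "caf\xE9");
print encode('UTF-8', $text), "\n";
```

Inside the `walk_parts` callback this could be called as `decode_text_body($part->content_type, $part->body)`, keeping the first defined result.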