Sanity Check has asked for the wisdom of the Perl Monks concerning the following question:

My programming skills are intermediate at best, and I have not used Perl much before, so please reply gently.

I'm trying to extract the original "from address" (NOT the "envelope-from address") from inbound emails.

I parse inbound emails that pass through MailScanner software on my server. If I write (using MailScanner's built-in message object):

my($message) = @_;
MailScanner::Log::InfoLog("from address: @{$message->{headers}}");

I get the following log entry (sanitized):

Received: from [192.168.12.34] (port=56309 helo=theirserver.theirdomain.tld) by server.mydomain.tld with esmtp (Exim 4.86) (envelope-from ) id 1aG62o-0002ad-Hu for recipient@mydomain.tld; Mon, 04 Jan 2016 09:23:34 -0500 Received: from 00a657f7.theirserver.theirdomain.tld ([127.0.0.1]:8056 helo=theirserver.theirdomain.tld) by theirserver.theirdomain.tld with ESMTP id 00PA657MF7; for ; Mon, 4 Jan 2016 06:22:53 -0800 Date: Mon, 4 Jan 2016 06:22:53 -0800 To: Message-ID: <70562391089443970564001376171645@theirserver.theirdomain.tld> From: "Sender" Subject: test Content-Language: en-us MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: multipart/alternative; boundary="----=Part.960.1818.1451917373"

If I write (based on a suggestion by MailScanner's author):

my($message) = @_;
my $from_address = grep /^From:\s+/i, @{$message->{headers}};
MailScanner::Log::InfoLog("from address after grep = $from_address ");

I get the following log entry:

from address after grep = 0

Not sure what to do with that result, I tried Data::Dumper via a MailScanner-compatible script I found online, which produced the following:

$VAR1 = bless( { 'nameinfected' => 0, 'otherinfected' => 0, 'disarmedtags' => [], 'othertypes' => {}, 'file2entity' => { '' => bless( { 'ME_Parts' => [ bless( { 'ME_Bodyhandle' => bless( { 'MB_Path' => '/var/spool/MailScanner/incoming/9365/1aG62o-0002ad-Hu/nmsg-9365-3.txt' }, 'MIME::Body::File' ), 'ME_Parts' => [], 'mail_inet_head' => bless( { 'mail_hdr_foldlen' => 79, 'mail_hdr_modify' => 0, 'mail_hdr_list' => [ 'Content-Transfer-Encoding: 8bit ', 'Content-Type: text/plain; charset="UTF-8" ' ], 'mail_hdr_hash' => { 'Content-Type' => [ \$VAR1->{'file2entity'}{''}{'ME_Parts'}[0]{'mail_inet_head'}{'mail_hdr_list'}[1] ], 'Content-Transfer-Encoding' => [ \$VAR1->{'file2entity'}{''}{'ME_Parts'}[0]{'mail_inet_head'}{'mail_hdr_list'}[0] ] }, 'mail_hdr_mail_from' => 'KEEP', 'mail_hdr_lengths' => {} }, 'MIME::Head' ) }, 'MIME::Entity' ), bless( { 'ME_Bodyhandle' => bless( { 'MB_Path' => '/var/spool/MailScanner/incoming/9365/1aG62o-0002ad-Hu/nmsg-9365-42.html' }, 'MIME::Body::File' ), 'ME_Parts' => [], 'mail_inet_head' => bless( { 'mail_hdr_foldlen' => 79, 'mail_hdr_modify' => 0, 'mail_hdr_list' => [ 'Content-Transfer-Encoding: 8bit ', 'Content-Type: text/html; charset="UTF-8" ' ], 'mail_hdr_hash' => { 'Content-Type' => [ \$VAR1->{'file2entity'}{''}{'ME_Parts'}[1]{'mail_inet_head'}{'mail_hdr_list'}[1] ], 'Content-Transfer-Encoding' => [ \$VAR1->{'file2entity'}{''}{'ME_Parts'}[1]{'mail_inet_head'}{'mail_hdr_list'}[0] ] }, 'mail_hdr_mail_from' => 'KEEP', 'mail_hdr_lengths' => {} }, 'MIME::Head' ) }, 'MIME::Entity' ) ], 'ME_Epilogue' => [ ' ' ], 'ME_Preamble' => [], 'mail_inet_head' => bless( { 'mail_hdr_foldlen' => 79, 'mail_hdr_modify' => 0, 'mail_hdr_list' => [ 'Received: from [192.168.12.34] (port=56309 helo=theirserver.theirdomain.tld) by server.mydomain.tld with esmtp (Exim 4.86) (envelope-from ) id 1aG62o-0002ad-Hu for recipient@mydomain.tld; Mon, 04 Jan 2016 09:23:34 -0500 ', 'Received: from 00a657f7.theirserver.theirdomain.tld ([127.0.0.1]:8056 helo=theirserver.theirdomain.tld) by theirserver.theirdomain.tld with ESMTP id 00PA657MF7; for ; Mon, 4 Jan 2016 06:22:53 -0800 ', 'Date: Mon, 4 Jan 2016 06:22:53 -0800 ', 'To: ', 'Message-ID: <70562391089443970564001376171645@theirserver.theirdomain.tld> ', 'From: "Sender" ', 'Subject: Test ', 'Content-Language: en-us ', 'MIME-Version: 1.0 ', 'Content-Transfer-Encoding: 8bit ', 'Content-Type: multipart/alternative; boundary="----=Part.960.1818.1451917373" ' ],
... and so on.

So I next try to parse mail_hdr_list with the following:

my($message) = @_;
MailScanner::Log::InfoLog("SpamWhitelist $msgid: mail_hdr_list @{$message->{headers}}[mail_hdr_list]");

and I get this result:

Received: from server.theirdomain.tld (192.168.165.54:49620 helo=server.theirdomain.tld)

I'm perplexed. I can't figure out how to get the From: address (as opposed to the envelope-from address) out of this object.

Any help rewriting my code would be greatly appreciated.

Replies are listed 'Best First'.
Re: Parse Email Header
by jdv (Sexton) on Jan 04, 2016 at 19:09 UTC

    Note that when you write:

    my $from_address = grep /^From:\s+/i, @{$message->{headers}};

    grep is in scalar context, so the return value is the number of matches and not the matches themselves. I think you probably want:

    my $from_address = ( grep /^From:\s+/i, @{$message->{headers}} )[0];

    which puts it in list context and then returns the value of the first match. However, the return value was 0, so there were no matches anyway. Can you provide the Data::Dumper output for $message->{headers}? I'm not sure what the dump you posted is from, but it looks like some higher-level object. The regex looks okay, so $message->{headers} must not contain what you think it does.
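To make the scalar-versus-list difference concrete, here is a minimal standalone sketch. It does not use MailScanner at all: `@headers` is a hypothetical stand-in for `@{$message->{headers}}`, assuming one header per array element, and the sender address is invented because the original was sanitized away. The last step, pulling the address out of the angle brackets, is likewise only an assumption about what the From: line looks like.

```perl
use strict;
use warnings;

# Hypothetical stand-in for @{ $message->{headers} }: one header per element.
# The address is made up; the real one was removed from the post.
my @headers = (
    "Date: Mon, 4 Jan 2016 06:22:53 -0800\n",
    "From: \"Sender\" <sender\@theirdomain.tld>\n",
    "Subject: test\n",
);

# Scalar context: grep returns the NUMBER of matching elements.
my $count = grep { /^From:\s+/i } @headers;

# List context (parentheses on the left): grep returns the matches themselves.
my ($from_line) = grep { /^From:\s+/i } @headers;

# Same result: index into the list that grep returns.
my $first = ( grep { /^From:\s+/i } @headers )[0];

# Assumption: the address sits inside <...>; a bare "From: user@host" line
# would need a different pattern or a dedicated address-parsing module.
my ($addr) = $from_line =~ /<([^>]+)>/;

print "count: $count\n";
print "line:  $from_line";
print "addr:  $addr\n";
```

If the count comes back as 0 on real data, as it did in the original post, the list-context forms return nothing and `$from_line` is undefined, so it is worth checking for that before matching against it.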

Re: Parse Email Header
by NetWallah (Canon) on Jan 04, 2016 at 18:59 UTC
    It is hard to tell what
    @{$message->{headers}}
    contains because you did not enclose the output in <code/> tags.

    In particular, it is not clear that the "From:" line starts at a newline boundary.

    If it does, the correct way to capture the content is:

    my ($from_address) = grep /^From:\s/i, @{$message->{headers}};
    #  ^             ^ Parens added to create list context, to get CONTENT rather than COUNT
    It would also help if you identified what variable was dumped using Data::Dumper, and had that formatted better, using <code> tags.
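The "newline boundary" point can be checked with a short sketch. It assumes, purely for illustration, that all headers arrive as a single string in one array element; nothing in the post confirms that MailScanner stores them this way, and the address is again invented. Setting `$Data::Dumper::Useqq` makes embedded newlines show up as `\n` in the dump, which answers the question of where each header actually starts.

```perl
use strict;
use warnings;
use Data::Dumper;

# Print newlines and tabs as \n and \t instead of literal whitespace.
$Data::Dumper::Useqq = 1;

# Hypothetical case: every header packed into ONE string, not one per element.
my @headers = (
    "Date: Mon, 4 Jan 2016 06:22:53 -0800\n"
  . "From: \"Sender\" <sender\@theirdomain.tld>\n"
  . "Subject: test\n"
);

print Dumper(\@headers);

# Without /m, ^ matches only at the very start of the string, so this counts 0.
my $plain = grep { /^From:\s/i } @headers;

# With /m, ^ also matches right after each embedded newline,
# and $ matches just before the newline that ends the line.
my ($from_line) = map { /^(From:\s.*)$/mi ? $1 : () } @headers;

print "matches without /m: $plain\n";
print "line found with /m: $from_line\n";
```

If the dump instead shows one header per element, the `/m` modifier is unnecessary and the list-context grep above is enough.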

            "I can cast out either one of your demons, but not both of them." -- the XORcist
