Since I am not graced with a scripting brain, and I'm not that great with Perl, I'm having a number of issues trying to report on (Postfix) mail that has originated from certain hosts in our domain destined for other specific domains. I have the regexps working (they're probably very ugly, but who cares), but I have problems associating the two message ids that each message has due to our antispam solution, while performing separate tests on them.
The line in the maillog that mentions both message ids is like this:
Jun 7 13:47:56 smtpserver postfix/smtp[16346]: A9507208022: to=<a.user@somwhere.gov>, relay=localhost[127.0.0.1], delay=0, status=sent (250 Ok: queued as B68C8208095)
The 11-12 character hex-looking strings are the message IDs. In an earlier line of the maillog, the first message ID tells me which client/host the message originated from:
Jun 7 13:47:56 smtpserver postfix/smtpd[12725]: A9507208022: client=ourhost1.our.domain[172.111.111.111]
The second message ID eventually shows up again in a maillog line which shows the message being accepted by the destination server:
Jun 7 13:47:57 smtpserver postfix/smtp[12379]: B68C8208095: to=<a.user@Somewhere.gov>, relay=server.somwhere.gov[10.0.100.100], delay=1, status=sent (250 ok: Message 2156237 accepted)
If a message ID has not originated from the correct host, I want to discard both associated IDs. If it has originated from one of the correct hosts, I want to print the line that shows the delivery. I've gotten as far as figuring out that I need to run through the maillog a few times to gather the message IDs, discard the ones from the incorrect hosts, and finally print the lines I want. I assume slurping each 75MB file might be a bit much (although I'd be happy to try it).
In short, how do I do the magic in the middle? Is a hash the best thing to hold both message IDs? Do I use the first message ID as the "key", and delete it if it's from the wrong client? i.e.:
foreach $line ( <FH> ) {
    # our external domain is a subset of the ones we're looking for
    if ( $line !~ /to=<.*(our\.domain)/i ) {
        if ( $line =~ /to=<.*\.gov>.*relay=localhost/i ) {
            @ids = ( $line =~ /\ ([0-9A-F]{11,12})/g );
            # first msgid is the key, second msgid is the value
            $msgids{$ids[0]} = $ids[1];
        }
    }
}
open (FH2, $file) or die "Cannot open file 2nd time: $!";
foreach $line ( <FH2> ) {
    # only the client= lines say where a message came from
    if ( $line =~ /: ([0-9A-F]{11,12}): client=/ ) {
        $key = $1;
        if ( $line !~ /client=(ourhost1|ourhost2)/ ) {
            delete $msgids{$key};
        }
    }
}
# then do something to run through the log again and print
# the delivery report line from the second msgid
Is it better to use a while loop to run through the IDs and do the rest of the matching, or a foreach, or what? Or is it a TIMTOWTDI situation?
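To make the hash idea concrete, here is a minimal sketch of the whole thing in two passes, reading line by line with while instead of slurping. It is only one way to do it, not a tested solution for a real maillog. The names delivery_lines, %from_wanted, %handoff and %keep are made up for illustration, the client pattern is passed in as an assumption, and the extra check that excludes our own external domain is left out to keep it short.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the final-delivery lines for mail that came
# from a wanted client and was handed to the local antispam filter.
sub delivery_lines {
    my ($file, $client_re) = @_;
    my (%from_wanted, %handoff);

    # Pass 1: read line by line (no slurping) and fill two lookups.
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        if ($line =~ /: ([0-9A-F]{11,12}): client=(\S+)/) {
            # first ID => 1 if the client is one of the wanted hosts
            $from_wanted{$1} = 1 if $2 =~ $client_re;
        }
        elsif ($line =~ /: ([0-9A-F]{11,12}): to=<[^>]*\.gov>.*relay=localhost.*queued as ([0-9A-F]{11,12})\)/i) {
            # hand-off line: first ID is the key, second ID is the value
            $handoff{$1} = $2;
        }
    }
    close $fh;

    # Keep only the second IDs whose first ID came from a wanted client.
    my %keep = map { $handoff{$_} => 1 }
               grep { $from_wanted{$_} } keys %handoff;

    # Pass 2: collect the delivery lines for the second IDs we kept.
    my @out;
    open $fh, '<', $file or die "Cannot open $file 2nd time: $!";
    while (my $line = <$fh>) {
        next unless $line =~ /: ([0-9A-F]{11,12}): to=</;
        next unless $keep{$1};
        next if $line =~ /relay=localhost/;
        push @out, $line if $line =~ /status=sent/;
    }
    close $fh;
    return @out;
}

# Runs only when called as a script with a maillog path as the argument.
if (!caller && @ARGV) {
    print delivery_lines($ARGV[0], qr/^(?:ourhost1|ourhost2)\./);
}
```

Filtering the pairs after the first pass means it does not matter whether the client= line or the "queued as" line shows up first in the log, and nothing ever has to be deleted from the hash.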
TIA for sanity-checking and input.