I have a rather involved Perl script that reads an email from STDIN and uses MIME::Parser and a few other libraries to split it into its components for writing to a database. The problem is that some emails now have subjects that span two lines, and MIME::Parser doesn't appear to process them properly.
use strict;
use Advisory;
use MIME::Parser;
use MIME::Entity;
use MIME::WordDecoder;
use MIME::Tools;
use MIME::Decoder;
use Email::MIME;

my $parser = MIME::Parser->new;
$parser->extract_uuencode(1);
$parser->extract_nested_messages(1);
$parser->output_to_core(1);

# Read the whole message from STDIN.
my $buf;
while (<STDIN>) {
    $buf .= $_;
}

my @mailData = split(/\n/, $buf);
my $entity   = $parser->parse_data($buf);
my $subject  = $entity->head->get('Subject');
my $from     = $entity->head->get('From');
my $AdvDate  = $entity->head->get('Date');

my $linecount = 0;
my $inadvis   = 0;
# Declared here so the excerpt compiles under "use strict".
my ($startShort, $shortDesc, $pkgname, $advisory) = ('', '', '', '');

foreach my $line (@mailData) {
    chomp($line);

    if ($line =~ m/^Description:/) {
        $startShort = "true";
        next;
    }

    # Collect the first five lines after "Description:" (numeric comparison).
    if ($linecount < 5 && $startShort eq "true") {
        $shortDesc .= $line . " ";
        $linecount++;
    }

    #print "line: $line\n";
    # MGASA-2018-0463 - Updated roundcubemail packages fix security vulnerability & bugs
    # MGASA-2018-0439 - Updated ansible package fixes security vulnerabilities
    #print "line: $line\n";
    if ($line =~ m/MGASA-(\d+)-(\d+) - Updated (.*) package/) {
        $pkgname = $3;
        $subject = "Mageia $1-$2: $3 security update";
        $inadvis = 1;
    }
    # MGASA-2019-0151 - Virtualbox 6.0.6 fixes security vulnerabilities
    elsif ($line =~ m/MGASA-(\d+)-(\d+) - (.*) (.*) fixes security vulnerabilities/) {
        $pkgname = $3;
        $subject = "Mageia $1-$2: $3 security update";
        $inadvis = 1;
    }
    # [updates-announce] MGASA-2023-0355: New chromium-browser-stable 120.0.6099.129 fixes bugs and vulnerabilities
    elsif ($line =~ m/MGASA-(\d+)-(\d+): New (.*) (.*) fixes bugs and/) {
        $pkgname = $3;
        $subject = "Mageia $1-$2: $3 security update";
        $inadvis = 1;
    }

    if ($inadvis == 1) {
        $advisory .= $line . "\n";
    }
}
The example subject I'm working with that spans two lines is as follows:
Subject: [updates-announce] MGASA-2023-0355: New chromium-browser-stable
 120.0.6099.129 fixes bugs and vulnerabilities
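For reference, here is a minimal sketch of what "unfolding" that subject would look like in plain Perl, assuming the fold follows the usual mail convention of a line break followed by a space or tab. The helper name unfold_header is hypothetical and not part of MIME::Parser, and the fold position in the sample string is assumed. It only illustrates why the third pattern in the script cannot match while the subject is still split across two lines.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIME::Parser): join a folded header
# by deleting each line break that is followed by a space or tab.
sub unfold_header {
    my ($text) = @_;
    $text =~ s/\r?\n(?=[ \t])//g;
    return $text;
}

# Sample header; the fold position after "stable" is an assumption.
my $raw = "Subject: [updates-announce] MGASA-2023-0355: New chromium-browser-stable\n"
        . " 120.0.6099.129 fixes bugs and vulnerabilities\n";

my $line = unfold_header($raw);
chomp $line;

# Same pattern as the script's third branch, applied to one logical line.
my ($year, $num, $pkgname);
if ($line =~ m/MGASA-(\d+)-(\d+): New (.*) (.*) fixes bugs and/) {
    ($year, $num, $pkgname) = ($1, $2, $3);
    print "Mageia $year-$num: $pkgname security update\n";
}
```

Mail::Header, which MIME::Head inherits from, also documents an unfold method that may do the same job on the parsed head; that route is untested here.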
I've put the entire mbox email in a paste here: https://pastebin.com/LRs9J4pd

Any ideas greatly appreciated.


In reply to MIME::Parser and multi-line subjects by gossamer
