sebastiannielsen2 has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to split a scalar containing multiple MIME header lines, which might be folded, into an array that holds one complete header per element.

I tried with the following:

@fixedheaders = split(/\n\S/, $fixedmsgheader);

But that eats the first character of the header lines. I need the split to take folded header lines into consideration and NOT split a folded header line in the middle.

Re: Splitting folded MIME headers into indivual headers?
by jeffa (Bishop) on Mar 02, 2015 at 21:12 UTC

    Did you have a look at or try something like MIME::Parser? It seems to be able to do what you need without you having to code the parsing yourself.

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    

      I'm already using that. What I need is to get the generated data in name => value format, as input to Sendmail::PMilter.

      If PMilter could accept an opaque header object, I would just pass the output from MIME::Parser to PMilter, but as it is I need to use: $ctx->Addheader(NAME, VALUE)

      Thus I need access to the individual header lines in a way that makes it possible to iterate over the headers.

      Any folding must be kept as-is to keep the output RFC compliant.
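A minimal sketch of the name => value step described above, assuming the headers have already been separated into one complete (possibly folded) header per array element. The helper name `header_pair` and the sample data are made up for illustration, and the PMilter call itself is only indicated in a comment, not executed:

```perl
use strict;
use warnings;

# Hypothetical helper: split one complete (possibly folded) header
# into its name and value without touching the folding.
sub header_pair {
    my ($header) = @_;
    # LIMIT of 2: only the first colon separates name from value,
    # so colons inside the value are left alone.
    my ($name, $value) = split /:[ \t]*/, $header, 2;
    return ($name, $value);
}

# Sample data, invented for this sketch
my @fixedheaders = (
    'Subject: Re: hello',
    "Content-Type: text/plain;\n\tformat=flowed",
);

for my $header (@fixedheaders) {
    my ($name, $value) = header_pair($header);
    # In the milter, this is where the add-header call from the post would go.
    print "name=[$name] value=[$value]\n";
}
```

The embedded "\n\t" sequences stay inside the value, so the folding is passed along unchanged.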

Re: Splitting folded MIME headers into indivual headers?
by roboticus (Chancellor) on Mar 02, 2015 at 20:48 UTC

    sebastiannielsen2:

    You're losing the first character of your headers because that "\S" is part of the separator, and split discards whatever the separator matches. Maybe you should try something more like:

    @fixedheaders = split(/\n+/, $fixedmsgheader);

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      How would that work? A folded header line contains a single \n, and the next line then begins with whitespace. A newline (\n) followed by a non-whitespace character is the beginning of a new header.

      The whitespace a folded line begins with is not necessarily a \n. In fact, \n would not be permitted, because a double \n (\n\n) marks the start of the body.

      Example of a non-folded header line, combined with a folded header line combined with a non-folded one:

      Subject: Hi you are beutiful
      Content-Type: text/plain;
      	format=flowed;
      	charset="iso-8859-1";
      	reply-type=original
      Return-Path: <example@example.org>

      This should result in:

      $fixedheaders[0] = "Subject: Hi you are beutiful";
      $fixedheaders[1] = "Content-Type: text/plain;\tformat=flowed;\tcharset=\"iso-8859-1\";\treply-type=original";
      $fixedheaders[2] = "Return-Path: <example@example.org>";

        sebastiannielsen2:

        In that case, I'd use:

        @fixedheaders = split /\n\b/, $fixedmsgheader;

        Update: checked my work with a test program:

        $ cat splitest.pl
        #!/usr/bin/env perl
        use strict;
        use warnings;

        my $t = q{
        Subject: Hi there
        Content-Type: text/plain;
            format=flowed;
            charset="iso-1189-1";
        Return-Path: <example@example.org>
        };

        my @fields = split /\n\b/, $t;
        print join("\n***\n", @fields), "\n";

        localadmins-MacBook-Pro-2:~ [mmason] $ perl splitest.pl

        ***
        Subject: Hi there
        ***
        Content-Type: text/plain;
            format=flowed;
            charset="iso-1189-1";
        ***
        Return-Path: <example@example.org>

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

        Oops, the result I gave was a little bit wrong. It should be:

        $fixedheaders[0] = "Subject: Hi you are beutiful";
        $fixedheaders[1] = "Content-Type: text/plain;\n\tformat=flowed;\n\tcharset=\"iso-8859-1\";\n\treply-type=original";
        $fixedheaders[2] = "Return-Path: <example@example.org>";

        I.e., with newlines before the tabs.

Re: Splitting folded MIME headers into indivual headers?
by hdb (Monsignor) on Mar 02, 2015 at 21:32 UTC

    What about good ol' line by line processing?

    use strict;
    use warnings;
    use Data::Dumper;

    my $t = q{
    Subject: Hi there
    Content-Type: text/plain;
        format=flowed;
        charset="iso-1189-1";
    Return-Path: <example@example.org>
    };

    my @fields;
    for ( split /\n/, $t ) {
        push @fields, $_ and next if /^\S+:/;
        $fields[-1] .= $1 if /\s*(.+)/;
    }
    print Dumper \@fields;
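Note that the loop above joins continuation lines onto the previous header, so the folding is lost. If the folding has to survive, as the original poster requested, a line-by-line variant might look like the sketch below. The helper name `split_headers` and the sample input are invented for illustration, and it assumes the input holds only the header block with plain \n line endings:

```perl
use strict;
use warnings;

# Hypothetical helper: collect complete headers line by line, keeping folds.
sub split_headers {
    my ($text) = @_;
    my @fields;
    for my $line ( split /\n/, $text ) {
        next if $line eq '';               # skip empty lines (input assumed to be headers only)
        if ( $line =~ /^[ \t]/ && @fields ) {
            $fields[-1] .= "\n$line";      # continuation: re-attach with its newline
        }
        else {
            push @fields, $line;           # a new header starts here
        }
    }
    return @fields;
}

# Sample input, invented for this sketch
my $sample = join "\n",
    'Subject: Hi there',
    'Content-Type: text/plain;',
    "\tformat=flowed;",
    'Return-Path: <example@example.org>',
    '';

my @fields = split_headers($sample);
print "[$_]\n" for @fields;
```

Each element then still contains the "\n" plus leading whitespace of its continuation lines, so it can be written back out unchanged.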
Re: Splitting folded MIME headers into indivual headers?
by johngg (Canon) on Mar 03, 2015 at 10:20 UTC

    I tried with the following:

    @fixedheaders = split(/\n\S/, $fixedmsgheader);

    But that eats the first char in header lines.

    Have you tried using a look-ahead?

    @fixedheaders = split(/\n(?=\S)/, $fixedmsgheader);

    Not tested, but that should stop the regex consuming the first character.
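For what it's worth, a small self-contained check of the look-ahead split could look like this. The sample header block is made up for illustration and assumes plain \n line endings with tab-indented continuation lines:

```perl
use strict;
use warnings;

# Sample header block (invented); continuation lines start with a tab.
my $fixedmsgheader = join "\n",
    'Subject: Hi there',
    'Content-Type: text/plain;',
    "\tformat=flowed;",
    "\tcharset=\"iso-8859-1\"",
    'Return-Path: <example@example.org>';

# Split on a newline only when the next character is not whitespace.
# The look-ahead (?=\S) is zero-width, so that character is not consumed.
my @fixedheaders = split /\n(?=\S)/, $fixedmsgheader;

print scalar(@fixedheaders), " headers\n";
print "[$_]\n" for @fixedheaders;
```

With this input the split yields three elements, and the Content-Type element keeps its embedded "\n\t" folds. If the header block ends with a trailing newline, that newline stays on the last element, so a chomp beforehand may be needed.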

    Cheers,

    JohnGG

Re: Splitting folded MIME headers into indivual headers?
by project129 (Beadle) on Mar 03, 2015 at 10:26 UTC

    Hi there!

    I propose using a Perl regexp positive look-ahead:

    split /\n(?=\S)/, $fixedmsgheader;

    P.S.: please also look at Email::MIME; the mail RFCs have a lot of hidden issues.

    good luck!