You could have a look at the Mail::Box module from CPAN. I am assuming that the mail folders are in Unix mbox format: plain ASCII text that can be read line by line.

I have started on a simple Perl app to do what you have described...
    #!C:\Perl\bin\Perl.exe -w
    use strict;
    use IO::File;
    use Data::Dumper;
    use Mail::Box;

    # Load mail list
    my $MailList = load_mail_list('./list25B6.txt');
    print Dumper($MailList);

    # Load folder list
    my $MailFolder = load_mail_folders('./hierarch.txt');
    print Dumper($MailFolder);

    # Parse folder files
    foreach (values %{$MailFolder}) {
        parse_mail_folder($_);
    }

    sub parse_mail_folder {
        # to be completed when I get back home...
    }

    sub load_mail_list {
        my $filename = shift;
        my $f = IO::File->new($filename, "r")
            or die "Can not open mail list";
        my %mlist;

        # load the header (title, sender and nosig lines)
        chomp($mlist{title}  = <$f>);
        chomp($mlist{sender} = <$f>);
        chomp($mlist{nosig}  = <$f>);

        # load the rest of the email addresses, one "Name <address>" per line
        my %MailAddress;
        while (<$f>) {
            chomp;
            my ($name, $email) = /^(.*)\s+<(.*)>$/;
            # skip lines that do not match, so -w does not warn on undef
            next unless defined $email and $email ne '';
            $MailAddress{$email} = $name;
        }
        $mlist{mlist} = \%MailAddress;

        return \%mlist;
    }

    sub load_mail_folders {
        my $filename = shift;
        my $f = IO::File->new($filename, "r")
            or die "Can not open mail folder list";
        my %mbox;

        while (<$f>) {
            chomp;
            next unless ( $_ ne '' and m/^0,0,/ );
            s/"//g;
            my @fld = split /,/;
            my $folder = (split /:/, $fld[2])[2];   # capture 3rd field
            $mbox{$fld[-1]} = "D:/Pmail/mail/$folder.PPM";   # full path to mboxes
        }

        return \%mbox;
    }
And the output so far...
    $VAR1 = {
              'title' => '\\TITLE Email Distribution',
              'nosig' => '\\NOSIG Y',
              'mlist' => {
                           'jbarker@someotherdomain.org' => 'David & Jan Barker',
                           'johnarnold@somedomain.com' => 'John & Jenny Arnold'
                         },
              'sender' => '\\SENDER Peter Rabbitt <peterr@example.com>'
            };
    $VAR1 = {
              'Main' => 'D:/Pmail/mail/FOL07093.PPM',
              'Microsoft' => 'D:/Pmail/mail/FOL024EB.PPM',
              'Sent' => 'D:/Pmail/mail/FOL04816.PPM'
            };
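
The parse_mail_folder stub is still empty. Below is a minimal sketch of one way it might be filled in using core Perl only. It is not the finished routine, and it rests on the same unverified assumption that the .PPM files are Unix mboxes. The helper name parse_mbox_text, the returned data layout, and the sample message text are all made up for illustration. It also simplifies: any line starting with "From " is treated as a message separator, and folded (multi-line) headers are ignored. Mail::Box could take over this work and handle those cases properly.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): split mbox-style
# text into messages and collect the From: and Subject: headers of each.
sub parse_mbox_text {
    my ($text) = @_;
    my @messages;
    my $current;          # hash ref for the message currently being read
    my $in_header = 0;    # true while we are inside the header block

    for my $line (split /\r?\n/, $text) {
        if ($line =~ /^From /) {          # mbox separator: a new message starts
            $current = { from => '', subject => '' };
            push @messages, $current;
            $in_header = 1;
            next;
        }
        next unless $current and $in_header;
        if ($line eq '') {                # first blank line ends the headers
            $in_header = 0;
        }
        elsif ($line =~ /^From:\s*(.*)$/i) {
            $current->{from} = $1;
        }
        elsif ($line =~ /^Subject:\s*(.*)$/i) {
            $current->{subject} = $1;
        }
    }
    return \@messages;
}

# Possible body for the stub: slurp one folder file and parse it.
sub parse_mail_folder {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                             # slurp mode: read the whole file
    my $text = <$fh>;
    close $fh;
    return parse_mbox_text(defined $text ? $text : '');
}

# Made-up sample data, only to show the shape of the result.
my $sample = join "\n",
    'From peterr@example.com Mon Jan  5 10:00:00 2004',
    'From: Peter Rabbitt <peterr@example.com>',
    'Subject: Email Distribution',
    '',
    'First body line.',
    'From johnarnold@somedomain.com Mon Jan  5 11:00:00 2004',
    'From: John & Jenny Arnold <johnarnold@somedomain.com>',
    'Subject: Re: Email Distribution',
    '',
    'Second body line.',
    '';

my $msgs = parse_mbox_text($sample);
printf "%s | %s\n", $_->{from}, $_->{subject} for @$msgs;
```

Each message comes back as a small hash, so the sender could be checked against the keys of $MailList->{mlist} built by load_mail_list.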

In reply to Re: Parsing email files, what are the best modules ? by Roger
in thread Parsing email files, what are the best modules ? by peterr
