Hi Peter, you could read the PPM documentation, as the Anonymous Monk suggested. You will probably also need Mail::Box and all of its derived modules.

I have completed the code I started earlier. The additional code is an example of the kind of thing you can do with the Mail::Box::Manager module. Pretty handy, I think.

#!C:\Perl\bin\Perl.exe -w
use strict;
use IO::File;
use Data::Dumper;
use Mail::Box;
use Mail::Box::Manager;

# Load mail list
my $MailList = load_mail_list('./list25B6.txt');
print Dumper($MailList);

# Load folder list
my $MailFolder = load_mail_folders('./hierarch.txt');
print Dumper($MailFolder);

# Parse folder files
foreach (values %{$MailFolder}) {
    parse_mail_folder($_);
}

# Optionally output $MailList into another file, etc.
# And other things ...

exit(0);

sub parse_mail_folder
{
    my $folder_file = shift;

    my $mgr    = Mail::Box::Manager->new();
    my $folder = $mgr->open($folder_file);
    unless ($folder) {
        warn "Cannot open folder $folder_file\n";
        return;
    }

    foreach my $message ($folder->messages) {
        my $dest = $message->get('To');       # retrieve the To-address
        next unless defined $dest;            # message has no To header
        my @email_addr = split /,/, $dest;    # retrieve multiple addresses

        # assume the email address format is as follows -
        #
        #    John & Jenny Arnold <johnarnold@somedomain.com>
        #
        # you have to tweak a bit if the format is not as expected
        # or use the Mail::Address module to do the trick - to
        # convert the mail address into its canonical form.
        foreach (@email_addr) {
            my ($name, $addr) = /(.*)<(.*)>/ or next;   # skip unexpected formats

            # note the binding operator =~ (a plain = would assign
            # the result of matching against $_ instead of trimming)
            $name =~ s/^\s+//;    # trim spaces at front
            $name =~ s/\s+$//;    # trim spaces at rear
            $addr =~ s/^\s+//;    # trim spaces at front
            $addr =~ s/\s+$//;    # trim spaces at rear

            # load_mail_list keeps the addresses under the 'mlist' key
            if (! exists $MailList->{mlist}{$addr}) {
                # ok, we haven't seen this Email address yet
                $MailList->{mlist}{$addr} = $name;
                # and do other things
            }
        }
    }

    $folder->close;
}

sub load_mail_list
{
    my $filename = shift;
    my $f = IO::File->new($filename, "r")
        or die "Cannot open mail list $filename: $!";

    my %mlist;

    # load the header
    chomp($mlist{title}  = <$f>);
    chomp($mlist{sender} = <$f>);
    chomp($mlist{nosig}  = <$f>);
    <$f>;    # skip one line

    # load the rest of the email addresses
    my %MailAddress;
    while (<$f>) {
        chomp;
        my ($name, $email) = /^(.*)\s+<(.*)>$/;
        next unless defined $email and $email ne '';
        $MailAddress{$email} = $name;
    }

    $mlist{mlist} = \%MailAddress;
    return \%mlist;
}

sub load_mail_folders
{
    my $filename = shift;
    my $f = IO::File->new($filename, "r")
        or die "Cannot open folder list $filename: $!";

    my %mbox;
    while (<$f>) {
        chomp;
        next unless ( $_ ne '' and m/^0,0,/ );
        s/"//g;
        my @fld = split /,/;
        my ($folder) = $fld[2] =~ /.*:.*:(.*)/;
        next unless defined $folder;
        $mbox{$fld[-1]} = "D:/Pmail/mail/$folder.PPM";   # full path to mboxes
    }

    return \%mbox;
}
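
The name/address clean-up in the inner loop can also be tried on its own, without Mail::Box installed. Here is a minimal sketch in plain Perl, assuming the "Name <address>" format described in the comments. The helper name split_address and the sample To: line are made up for illustration; they are not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mail::Box): split one "Name <addr>" entry
# into a trimmed name and address. Returns an empty list if no <...> is found.
sub split_address {
    my ($entry) = @_;
    my ($name, $addr) = $entry =~ /^(.*)<([^>]*)>/
        or return;
    for ($name, $addr) {
        s/^\s+//;    # trim leading whitespace
        s/\s+$//;    # trim trailing whitespace
    }
    return ($name, $addr);
}

# Made-up To: header, for illustration only
my $to = 'John & Jenny Arnold <johnarnold@somedomain.com>, Peter < peter@example.com >';

for my $entry (split /,/, $to) {
    my ($name, $addr) = split_address($entry) or next;
    print "$name => $addr\n";
}
```

Splitting on commas is still naive (a quoted name containing a comma would break it), which is where Mail::Address would probably be the safer choice.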

Re: Re: Re: Re: Parsing email files, what are the best modules ?
by peterr (Scribe) on Nov 11, 2003 at 05:19 UTC
    Hi Roger,

    You will probably also need Mail::Box and all of its derived modules.

    I certainly got plenty of these messages:

    Warning: prerequisite Scalar::Util failed to load: Can't locate Scalar/Util.pm in @INC (@INC contains: D:/Perl/lib D:/Perl/site/lib .) at (eval 46) line 3.
    Warning: prerequisite Test::Harness 1.38 not found at D:/Perl/lib/ExtUtils/MakeMaker.pm line 343.

    when running Makefile.PL from the Mail::Box tar archive. The problem is, I used

    perl -MCPAN -e "shell"
    cpan> install Scalar::Util

    and the error message still appeared, even though the install seemed to go okay. I even tried re-installing Mail::Box:

    D:\Perl\myscripts>\perl\bin\perl.exe -MCPAN -e "shell"
    cpan shell -- CPAN exploration and modules installation (v1.59_54)
    ReadLine support available (try 'install Bundle::CPAN')
    cpan> install Mail::Box
    CPAN: Storable loaded ok
    Going to read \.cpan\Metadata
    Database was generated on Tue, 11 Nov 2003 00:45:51 GMT
    Mail::Box is up to date.
    cpan> q
    Lockfile removed.

    and then running the Perl script still gave the following:

    D:\Perl\myscripts>\perl\bin\perl.exe checke~1.pl
    Can't locate Scalar/Util.pm in @INC (@INC contains: D:/Perl/lib D:/Perl/site/lib .) at D:/Perl/site/lib/Mail/Reporter.pm line 9.
    BEGIN failed--compilation aborted at D:/Perl/site/lib/Mail/Reporter.pm line 9.
    Compilation failed in require at (eval 1) line 3.
            ...propagated at D:/Perl/lib/base.pm line 62.
    BEGIN failed--compilation aborted at D:/Perl/site/lib/Mail/Box.pm line 8.
    Compilation failed in require at checke~1.pl line 5.
    BEGIN failed--compilation aborted at checke~1.pl line 5.

    I have gone through all the "prerequisite" warning messages, made a note of those modules, and then used the CPAN shell (-MCPAN) to install them. Each install appears to go okay: it goes out to the internet, parses through files on FTP sites, and reports that the module installed okay. Yet the script still cannot find it.
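
A quick way to see what this particular perl can actually load is sketched below. It is only a diagnostic under some assumptions: the helper name module_location is made up, and the three module names are simply the ones from the messages above. It uses nothing beyond core Perl (require, %INC, @INC, $^X).

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a module by name and return the file it
# was loaded from, or undef if this perl cannot find or compile it.
sub module_location {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Scalar::Util -> Scalar/Util.pm
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

print "perl binary : $^X\n";
print "perl version: $]\n";
print "\@INC entry  : $_\n" for @INC;

for my $module (qw(Scalar::Util Test::Harness Mail::Box)) {
    my $where = module_location($module);
    printf "%-15s %s\n", $module, defined $where ? $where : 'NOT FOUND';
}
```

If a module shows NOT FOUND here even though an installer reported success, it may be worth comparing the binary and @INC paths printed above with the ones the installer was using.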

    As far as I can tell (but I don't really know), the Perl script is stopping at line 9 of Reporter.pm, which has

    use Scalar::Util 'dualvar';

    and I know I have installed _that_ module. The other code referred to by the error messages is

    # msg - "...propagated at D:/Perl/lib/base.pm line 62."
    die if $@ && $@ !~ /^Can't locate .*? at \(eval /;

    # msg - "compilation aborted at D:/Perl/site/lib/Mail/Box.pm line 8."
    use base 'Mail::Reporter';
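
The base.pm line quoted above can be exercised on its own to see why the error was passed on. The sketch below applies the same pattern to two messages: a made-up one for a missing base class, and the Scalar::Util message from the script output. The sub name base_would_rethrow is hypothetical, and this is only my reading of that one line, not of the rest of base.pm.

```perl
use strict;
use warnings;

# Same test as the quoted base.pm line: an error is passed on unless it is a
# "Can't locate ..." message whose first line points at base.pm's own eval.
sub base_would_rethrow {
    my ($error) = @_;
    return ($error && $error !~ /^Can't locate .*? at \(eval /) ? 1 : 0;
}

# Made-up message: the base class file itself is missing
my $missing_base =
    "Can't locate Mail/Reporter.pm in \@INC (\@INC contains: .) at (eval 1) line 3.\n";

# Message from the script output: Reporter.pm was found, its dependency was not
my $missing_dep = join '',
    "Can't locate Scalar/Util.pm in \@INC (\@INC contains: D:/Perl/lib D:/Perl/site/lib .) ",
    "at D:/Perl/site/lib/Mail/Reporter.pm line 9.\n",
    "BEGIN failed--compilation aborted at D:/Perl/site/lib/Mail/Reporter.pm line 9.\n",
    "Compilation failed in require at (eval 1) line 3.\n";

print "missing base class : ", base_would_rethrow($missing_base), "\n";
print "missing dependency : ", base_would_rethrow($missing_dep), "\n";
```

The first case prints 0 and the second prints 1: the "." in the pattern does not cross a newline, so the "(eval 1)" on the third line of the real message does not count. That suggests base.pm is only reporting the problem, and the real failure is still the missing Scalar/Util.pm.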

    I'm just about all debugged out, and have run out of clues.

    I have completed the code I started earlier. The additional code is an example of the kind of thing you can do with the Mail::Box::Manager module. Pretty handy, I think.

    Thanks very much for that additional code, Roger. I guess the big question is: what is different between your Perl setup and mine?

    Thanks a lot, :)

    Peter

      Looks like you have some maintenance to do. You need to download and install the Scalar::Util module from CPAN, and probably other modules you think are useful too. Can't Perl without them. 8^p

      By the way, the perl version I use is Active Perl 5.8.1.

        Hi Roger,

        You need to download and install the Scalar::Util module from CPAN, and probably other modules you think are useful too.

        Okay, I assume I can just "install" by starting the CPAN shell and then installing the modules I need from its prompt, like this:

        D:\Perl\myscripts>\perl\bin\perl.exe -MCPAN -e "shell"

        By the way, the perl version I use is Active Perl 5.8.1.

        I could only find ActivePerl 5.8.0 build 806, from http://www.activestate.com/Products/Download/Download.plex?id=ActivePerl

        Thanks,

        Peter

        Roger,

        By the way, the perl version I use is Active Perl 5.8.1

        Can you please tell me where to get version 5.8.1? I can only find ActivePerl 5.8.0 build 806.

        I think I may have screwed up some of the Perl modules, because I have used PPM, the CPAN shell, and even a manual install (downloading Mail::Box from the website, uncompressing it, and then running Makefile.PL, etc.). To me, it's confusing to know what I _should_ use to install any missing modules. Also, I don't have a C compiler installed, which could be complicating things. :(

        Anyway, I'm going to uninstall everything, install "ActivePerl 5.8.0 build 806", and then try to run the sample script you supplied, taking it slowly, step by step, and fixing any 'errors' (missing modules, etc.) as they come up.

        Thanks for your help,

        Peter