ady has asked for the wisdom of the Perl Monks concerning the following question:

Gassho!
I want to parse the plain text portion out of email messages ((in this case received in MS Outlook, but sent from any (i.e. unknown) mailer) end lisp).
Before I go dig & hack the modules at CPAN (currently I have very limited knowledge of mail formats, MIME & such...), I'd like to ask if any of the reverend monks here can point me to a suitable module for this simple email text extraction task, preferably a "lite" one, as simple extraction is the only thing I want to perform at this point in time.
A pointer to a digestible, crisp & low-carb tutorial on mail formats (and possibly parsing) would also be much appreciated.

Thanks --
Allan

As the eternal tranquility of Truth reveals itself to us, this very place is the Land of Lotuses-- Hakuin Ekaku Zenji

Re: parsing text portion out of email
by gellyfish (Monsignor) on Mar 31, 2005 at 13:44 UTC
      WOW!
      Seems Email::Simple did the job in ½ a minute, while I had been hacking at a manual filter for over an hour without quite tying up all the loose ends.
      Thanks a heap, mate! Best regards
      Allan
        NAAHH...
        Too fast there...
        It hangs on this code... gotta figure out why.

        #!/usr/bin/perl -w
        use strict;
        use Email::Simple;

        my $dir = ".";
        opendir (DH, $dir) or die "Can't open directory $dir: $!";
        while (defined (my $file = readdir(DH))) {
            next if ($file =~ /^\.\.?$/);
            open (FH, "$dir\\$file") or die "Can't open $dir\\$file: $!\n";
            my $e = do { local $/; <FH> };        # slurp .msg file
            my $mail = Email::Simple->new($e);    # convert to mail object (?)
            $e = $mail->body;                     # retrieve body text
            print $e;                             # dump to STDOUT
            close(FH) or die "Can't close $dir\\$file: $!\n";
        }
        closedir(DH) or die "Can't close $dir: $!";

        allan

      Two modules the OP can probably put to good use. If not, MIME::Parser is a good option; I used it recently and it seems to work quite well. And of course there's Mail::Box, if you have time to spare to dig through the extensive and often unclear documentation ;)
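
For what it's worth, here is a minimal sketch of the MIME::Parser route, assuming the input is an ordinary RFC 2822/MIME message held in a string. The helper name plain_text_parts and the sample message are made up for illustration; they come from neither the thread nor the module. The idea is to walk the parsed entity tree and keep only the decoded bodies of the text/plain parts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: recursively collect the decoded bodies of all
# text/plain leaf parts below (and including) the given MIME::Entity.
sub plain_text_parts {
    my ($entity) = @_;
    my @parts = $entity->parts;
    return map { plain_text_parts($_) } @parts if @parts;   # multipart: descend
    return () unless $entity->mime_type eq 'text/plain';    # skip HTML, attachments
    my $body = $entity->bodyhandle or return ();
    return $body->as_string;                                # transfer-decoded text
}

# Made-up sample message with a plain-text and an HTML alternative.
my $raw = <<'END';
From: sender@example.com
To: allan@example.com
Subject: demo
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="xyz"

--xyz
Content-Type: text/plain; charset=us-ascii

Hello in plain text.
--xyz
Content-Type: text/html; charset=us-ascii

<p>Hello in HTML.</p>
--xyz--
END

my $parser = MIME::Parser->new;
$parser->output_to_core(1);    # keep decoded bodies in memory, not in files

my $entity = $parser->parse_data($raw);
print plain_text_parts($entity);
```

To run it over real mail, replace the here-doc with the slurped contents of a message file. Unlike Email::Simple's body method, which hands back the raw body without any MIME decoding, this skips the HTML alternative and attachments. It only works if the file really is a plain-text mail message, though.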

      --
      b10m

      All code is usually tested, but rarely trusted.
Re: parsing text portion out of email
by cog (Parson) on Mar 31, 2005 at 13:47 UTC