in reply to Re: Extracting TEXT from email
in thread Extracting TEXT from email

Thanks.
For further clarification, an example:

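    use Mail::Internet;

    $msgfile = "Angelee.msg";
    open (MSG, $msgfile) or die "Can't open $msgfile: $!\n";
    $msg = new Mail::Internet \*MSG;   # parse headers and body from the filehandle
    close (MSG);

    $body = $msg->body();              # reference to an array of body lines
    $msg->print_body(\*STDOUT);        # dump the raw, undecoded body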

The message body as dumped to the terminal is approx. 80% binary data and HTML formatting characters; only about 20% corresponds to the transmitted TEXT payload.

I'd like a function to strip off all that junk:
msg_clean($body);

OK, so I've written some regex filtering to do it, but that's hardly as flexible and robust as real MIME-aware parsing could be.
I expect there would be a msg_clean or body2text or equivalent function out there?

allan
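
For reference, here is a minimal sketch of what MIME-aware extraction could look like, assuming the MIME-tools distribution (MIME::Parser) is installed. The names msg_text and collect_text are hypothetical helpers, not an existing library function, and the sample message is made up for illustration. It walks the MIME tree and keeps only the decoded text/plain parts, skipping HTML and attachments. It does not convert character sets, and it returns nothing for an HTML-only message.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: return the decoded text/plain parts of a raw
# message (passed as one string), joined into a single string.
sub msg_text {
    my ($raw) = @_;
    my $parser = MIME::Parser->new;
    $parser->output_to_core(1);          # keep decoded bodies in memory
    my $entity = $parser->parse_data($raw);
    my @text;
    collect_text($entity, \@text);
    return join "\n", @text;
}

# Recurse through multipart containers; collect leaf text/plain bodies.
sub collect_text {
    my ($entity, $out) = @_;
    if (my @parts = $entity->parts) {
        collect_text($_, $out) for @parts;
    }
    elsif ($entity->effective_type eq 'text/plain') {
        my $bh = $entity->bodyhandle or return;
        push @$out, $bh->as_string;      # transfer encoding already decoded
    }
}

# Made-up sample: one plain-text part and one HTML alternative.
my $sample = <<'EOM';
From: someone@example.com
Subject: test
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hello =3D world

--XYZ
Content-Type: text/html; charset="us-ascii"

<html><body><b>Hello = world</b></body></html>

--XYZ--
EOM

print msg_text($sample);
```

To run it against a file such as Angelee.msg, read the file into a string and pass that to msg_text, or swap parse_data for the parser's parse_open method with the file name.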

Re^3: Extracting TEXT from email
by PodMaster (Abbot) on Apr 30, 2005 at 11:52 UTC
    How about you go study mimeexplode?

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      I did try to run mimeexplode on a msg file, and it placed the body of the msg in a .txt file in a subdirectory.

      But this "exploded" file contains the same amount of binary & HTML "junk", so it does not spare me the job of post-filtering to get at the payload TEXT.
      -- allan
        But this "exploded" file contains the same amount of binary & HTML "junk", so it does not spare me the job of post-filtering to get at the payload TEXT.
        That sounds hinky. Can you post a sample file?
