Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, I'm a newbie in Perl and definitely want to learn. I'm playing around with the module Mail::POP3Client, trying to strip mails. One problem I had is that the body of the email contains HTML tags. Because I needed only the text content, I stripped those tags away using HTML::TreeBuilder, and it left me with this:
------=_NextPart_000_007F_01C180F0.5D9AD8A0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

i wonder how can we know that its the email's body

------=_NextPart_000_007F_01C180F0.5D9AD8A0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

i wonder how can we =
know that=20
its the email's body

------=_NextPart_000_007F_01C180F0.5D9AD8A0--

Is there a way to clean away these unnecessary strings and leave just the text body of the email?

By the way, the body should only be "i wonder how can we know that its the email's body".

Any help will be greatly appreciated. Thanks.

Replies are listed 'Best First'.
Re: Stripping Email Body
by davorg (Chancellor) on Dec 10, 2001 at 19:30 UTC

    Sounds like you need the MIME::Parser module. It's part of the MIME-tools bundle of modules.

    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: Stripping Email Body
by atlantageek (Monk) on Dec 10, 2001 at 19:56 UTC
    You need to pull the body of the message out with MIME::Parser and only then think about stripping HTML tags. In fact, MIME::Parser will tell you how many parts there are, so you can check whether one of them has the MIME type text/plain. If there is one, use it as is, without stripping any HTML tags. If there is no text/plain part but there is a text/html part, grab that one and throw away the HTML tags. Also, the following should eliminate the HTML tags without using HTML::TreeBuilder: $body =~ s/<.*?>//g;
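
    A minimal sketch of that approach, assuming MIME-tools is installed from CPAN and the whole raw message (headers included) is held in one string. The helper name plain_body and the sample message are made up for illustration. The substitution is the crude one-liner above with /s added so a tag may span lines; it does not decode HTML entities and will trip over a > inside an attribute value.

```perl
use strict;
use warnings;
use MIME::Parser;    # from the MIME-tools distribution (not core Perl)

# Hypothetical helper: return the best plain-text rendering of a raw message.
sub plain_body {
    my ($raw) = @_;

    my $parser = MIME::Parser->new;
    $parser->output_to_core(1);    # keep decoded bodies in memory, not in files on disk
    my $entity = $parser->parse_data($raw);

    # parts_DFS returns the entity itself plus all nested parts, depth first.
    # Multipart containers have no bodyhandle, so this keeps only leaf parts.
    my @leaves = grep { defined $_->bodyhandle } $entity->parts_DFS;

    # Prefer text/plain: its decoded body needs no tag stripping at all.
    my ($plain) = grep { $_->effective_type eq 'text/plain' } @leaves;
    return $plain->bodyhandle->as_string if $plain;

    # Otherwise fall back to text/html and remove the tags.
    my ($html) = grep { $_->effective_type eq 'text/html' } @leaves;
    return '' unless $html;

    my $text = $html->bodyhandle->as_string;    # quoted-printable already decoded
    $text =~ s/<.*?>//gs;                       # crude tag removal
    return $text;
}

# Made-up sample message shaped like the one in the question.
my $boundary = '----=_NextPart_000_007F_01C180F0.5D9AD8A0';
my $raw = join "\n",
    'MIME-Version: 1.0',
    'Content-Type: multipart/alternative;',
    qq{ boundary="$boundary"},
    '',
    "--$boundary",
    'Content-Type: text/plain; charset="iso-8859-1"',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    q{i wonder how can we know that its the email's body},
    '',
    "--$boundary",
    'Content-Type: text/html; charset="iso-8859-1"',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    '<html><body><p>i wonder how can we =',
    'know that=20',
    q{its the email's body</p></body></html>},
    '',
    "--$boundary--",
    '';

print plain_body($raw);
```

    With Mail::POP3Client, the string returned by HeadAndBody in scalar context should be suitable input for plain_body, since MIME::Parser needs the headers to find the boundary.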
    ----
    I always wanted to be somebody... I guess I should have been more specific.