Display html in plaintext

by Spida (Acolyte)
on Oct 09, 2002 at 12:59 UTC ( [id://203910]=perlquestion )

Spida has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse mails in a Perl script (see http://www.perlmonks.org/index.pl?node_id=203907) and add the contents to a database (to be displayed in a web front end), but occasionally, and increasingly often, mails carry their text in an HTML attachment.
To display the contents even if the mail is in HTML, I'm looking for something that would
- strip out all HTML tags (no problem, looking for \< and \>)
- keep a bit of the formatting visible
I recall a tool that tried to preserve even tables, but my search didn't turn up anything promising.
Please help.

Replies are listed 'Best First'.
Re: Display html in plaintext
by fruiture (Curate) on Oct 09, 2002 at 13:32 UTC

    One of the major reasons (or the major reason) for Perl's power is CPAN. Whenever you have a task that you think has probably been done several times before, search search.cpan.org for it. In this case: HTML::FormatText.
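
    A minimal sketch of how that module is typically driven, assuming HTML::TreeBuilder and HTML::FormatText are both installed from CPAN. The helper name html_to_text and the sample markup are made up for illustration; the margins are arbitrary. As far as I know, HTML::FormatText does not lay out tables, so this covers the "a bit of formatting" wish rather than the table one.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;   # CPAN (HTML-Tree distribution)
use HTML::FormatText;    # CPAN (HTML-Format distribution)

# Hypothetical helper: takes an HTML string, returns a plain-text rendering.
sub html_to_text {
    my ($html) = @_;

    # Parse the markup into a tree; the formatter works on the tree,
    # not on the raw string.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Margins are an arbitrary choice for this sketch.
    my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
    my $text = $formatter->format($tree);

    $tree->delete;   # release the parse tree
    return $text;
}

# Made-up sample standing in for the HTML part of a mail.
my $sample = '<html><body><h1>Status</h1>'
           . '<p>The <b>build</b> passed.</p></body></html>';
print html_to_text($sample);
```

    Headings and paragraphs survive as spacing and underlining, which may already be enough for a web front end that shows the text in a preformatted block.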

    update: ABBR and ACRONYM are standard HTML elements. Don't be confused.

    --
    http://fruiture.de
Re: Display html in plaintext
by arturo (Vicar) on Oct 09, 2002 at 13:41 UTC

    Note that there are command-line tools that you may have available that will do pretty much what you want; no need to whip up a perl script when lynx will do:

    lynx --dump <url>

    This preserves some of the formatting, and labels the links as well. Lynx' table-savvy descendant, links, also takes --dump as an option, but the output is not as nice.
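
    If the HTML lives in a mail rather than behind a URL, one way to use the same tool from the parsing script is to write the HTML part to a temporary file and hand that file to lynx. The following is only a sketch: the helper name dump_html_with is hypothetical, and it assumes lynx is on the PATH.

```perl
use strict;
use warnings;
use File::Temp ();   # core module

# Hypothetical helper: write $html to a temporary .html file, append the
# file's path to @cmd, run the command and return what it prints.
sub dump_html_with {
    my ($html, @cmd) = @_;

    # The file is removed automatically when $tmp goes out of scope.
    my $tmp = File::Temp->new(SUFFIX => '.html');
    print {$tmp} $html;
    close $tmp or die "Cannot write temp file: $!";

    # List-form pipe open: no shell is involved, so nothing needs quoting.
    open(my $out, '-|', @cmd, $tmp->filename)
        or die "Cannot run $cmd[0]: $!";
    my $text = do { local $/; <$out> } // '';
    close $out or die "$cmd[0] failed (exit status $?)";

    return $text;
}

# Usage, assuming lynx is installed and $html holds the mail's HTML part:
#   my $text = dump_html_with($html, 'lynx', '-dump');
# Adding '-nolist' should suppress the numbered link list at the end.
```

    The command is passed in as a list so that another text-mode browser can be swapped in without touching the helper.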

    If not P, what? Q maybe?
    "Sidney Morgenbesser"

      w3m is my favorite tool for converting to text, and it has good support for tables as well; in fact, that is why I started using it. I needed to send out emails with an HTML and a text version of the same information, which was in a table, and w3m is working nicely.
      See this node for more information.
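
      A rough sketch of calling w3m from a script by piping the HTML to its standard input, assuming w3m is on the PATH. The helper name filter_html_through is hypothetical, and the column width in the usage comment is arbitrary. A forked child does the writing so the parent can read at the same time, which avoids stalling on large inputs.

```perl
use strict;
use warnings;
use POSIX ();   # core module, used only for _exit in the child

# Hypothetical helper: feed $html to a command on its standard input and
# return the command's standard output.
sub filter_html_through {
    my ($html, @cmd) = @_;

    # Forking open: the parent reads whatever the child side writes to STDOUT.
    my $pid = open(my $from_kid, '-|');
    die "Cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: start the command with its STDIN attached to us. It
        # inherits our STDOUT, which is the pipe back to the parent.
        open(my $to_cmd, '|-', @cmd) or do {
            warn "Cannot run $cmd[0]: $!\n";
            POSIX::_exit(127);
        };
        print {$to_cmd} $html;
        close $to_cmd;
        POSIX::_exit($? ? 1 : 0);
    }

    # Parent: collect the rendered text.
    my $text = do { local $/; <$from_kid> } // '';
    close $from_kid or die "$cmd[0] failed (exit status $?)";
    return $text;
}

# Usage, assuming w3m is installed and $html holds the HTML (tables included):
#   my $text = filter_html_through($html,
#       'w3m', '-dump', '-T', 'text/html', '-cols', '72');
```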
