    use strict;
    use warnings;
    use HTML::FormatText;
    use HTML::Parse;

    # Slurp everything after __DATA__ into one string
    my $data = do { local $/; <DATA> };
    my $html = parse_html($data);

    # Wrap the rendered text between columns 0 and 50
    my $formatter = HTML::FormatText->new(
        leftmargin  => 0,
        rightmargin => 50,
    );
    my $ascii = $formatter->format($html);
    print "$ascii\n";

    __DATA__
    <p class="fol">Here's some text that goes in the body of the article. It has some list items like this:</p>
    <ul>
    <li>List item one</li>
    <li>List item two</li>
    </ul>

This generates the following output:

    Here's some text that goes in the body of the
    article. It has some list items like this:

      * List item one

      * List item two

I have found that converting HTML to text is hard, and the best free tool I have found so far is lynx -dump. Of course, the best solution is to never mix presentation with data! :)
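For comparison, here is a rough sketch of the lynx -dump route driven from Perl. It assumes lynx is installed and on your PATH, and the helper names lynx_command and html_to_text are made up for this example; they are not part of any module. The -nolist and -width options are only there to drop the link list and match the 50-column margin used above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: build the lynx command line for a file and a width.
sub lynx_command {
    my ($file, $width) = @_;
    return ('lynx', '-dump', '-nolist', "-width=$width", $file);
}

# Hypothetical helper: render an HTML string as plain text via lynx.
# Returns the text, or undef if lynx could not be run.
sub html_to_text {
    my ($html, $width) = @_;

    # A .html suffix lets lynx treat the temporary file as HTML.
    my ($fh, $file) = tempfile(SUFFIX => '.html', UNLINK => 1);
    print {$fh} $html;
    close $fh or return;

    # List-form pipe open keeps the file name away from the shell.
    my @cmd = lynx_command($file, $width);
    open(my $lynx, '-|', @cmd) or return;
    my $text = do { local $/; <$lynx> };
    close $lynx or return;
    return $text;
}

my $html = <<'HTML';
<p class="fol">Here's some text that goes in the body of the article.
It has some list items like this:</p>
<ul>
<li>List item one</li>
<li>List item two</li>
</ul>
HTML

my $text = html_to_text($html, 50);
print defined $text ? $text : "lynx is not available on this system\n";
```

Writing the HTML to a temporary file is just one way to hand it to lynx; the exact spacing of the dump depends on the lynx version and its configuration, so treat the result as a starting point rather than a fixed format.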
Update:
In case you are wondering where that extra bullet
came from, it is the result of the closing li tags. Looks
like HTML::FormatText could use an upgrade to support
XHTML. -- Good catch, Hero Zzyzzx! ;) I fixed
this typo since hacker asked me to fix the original. For
historical purposes, the first list item used to look like this:
<li>List item one<li>.
jeffa
Remember kids, just say no to mixing data and presentation!

In reply to (jeffa) Re: HTML input to PDF output, by jeffa, in thread "HTML input to PDF output" by hacker