making like `lynx -dump`

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

RE: making like `lynx -dump` by Anonymous Monk on Mar 13, 2000 at 22:13 UTC
You can try HTML::Parser or HTML::TokeParser.
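As a rough sketch of the HTML::TokeParser route: walk the token stream, keep the text tokens, and turn `<br>` and `<p>` start tags into newlines. The `html_to_text` name and the sample markup are made up for illustration, and this does none of the line wrapping or link numbering that lynx does.

```perl
use strict;
use warnings;
use HTML::TokeParser;
use HTML::Entities qw(decode_entities);

# Hypothetical helper (not from the thread): reduce an HTML string to text.
sub html_to_text {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot set up parser\n";
    my $text = '';
    while (my $token = $parser->get_token) {
        my $type = $token->[0];
        if ($type eq 'T') {
            # Text token: [ "T", $text, $is_data ].
            # Decode entities unless the text is flagged as literal data.
            $text .= $token->[2] ? $token->[1] : decode_entities($token->[1]);
        }
        elsif ($type eq 'S' && $token->[1] =~ /^(?:br|p)$/) {
            # Start tag: [ "S", $tag, ... ]; tag names arrive lowercased.
            $text .= "\n";
        }
    }
    return $text;
}

print html_to_text('<p>Hello <b>world</b><br>bye</p>'), "\n";
```

Everything else (end tags, comments, declarations) is simply skipped, so how blocks like tables or lists should look in the output is left open.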
RE: RE: making like `lynx -dump` by Anonymous Monk on Mar 14, 2000 at 12:03 UTC
I found a really good one, actually: HTML::FormatText. It works like a charm! Yeah Tim, I was doing something like that previously, but my application has gotten a little more critical and, because of network lapses etc., I need to make sure I get the file. So I had the choice of either writing a wrapper for lynx or writing a more flexible page-getting program, and I went with the latter. Thanks for all your help; it's greatly appreciated.
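For anyone who wants to try the same thing, here is a minimal sketch of that kind of page getter, not the actual program: it fetches with LWP::UserAgent, retries a few times on failure, and hands the HTML to HTML::FormatText by way of HTML::TreeBuilder. The helper names, the retry count, the delay, and the URL in the usage comment are all assumptions.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper: fetch $url, retrying up to $tries times.
sub fetch_with_retry {
    my ($url, $tries, $delay) = @_;
    $tries ||= 3;                       # assumed default
    $delay = 5 unless defined $delay;   # assumed default, in seconds
    my $ua = LWP::UserAgent->new(timeout => 30);
    my $response;
    for my $attempt (1 .. $tries) {
        $response = $ua->get($url);
        return $response->decoded_content if $response->is_success;
        warn "Attempt $attempt failed: ", $response->status_line, "\n";
        sleep $delay if $attempt < $tries;
    }
    die "Giving up on $url: ", $response->status_line, "\n";
}

# Hypothetical helper: render an HTML string as plain text.
sub format_html {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
    my $text = $formatter->format($tree);
    $tree->delete;    # free the parse tree
    return $text;
}

# Usage (placeholder URL):
# print format_html(fetch_with_retry('http://www.example.com/', 3, 5));
```

Keeping the fetch and the formatting in separate subs means the retry policy can change without touching the text conversion.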
RE: making like `lynx -dump` by vroom (His Eminence) on Mar 14, 2000 at 00:07 UTC
If you have lynx on the system, you may as well just do $text=`lynx -dump $url`;
vroom | Tim Vroom | vroom@cs.hope.edu
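One caveat with backticks is that the command goes through the shell, so a URL containing `&` or `;` (common in query strings) gets split or misread. A minimal sketch of a shell-free alternative, assuming a Unix-like system, uses the list form of a piped open. The `capture` helper name is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command without a shell and return its output.
sub capture {
    my @cmd = @_;
    open(my $fh, '-|', @cmd) or die "Cannot run $cmd[0]: $!\n";
    my $out = do { local $/; <$fh> };   # slurp everything the command prints
    $out = '' unless defined $out;
    close($fh)
        or die "$cmd[0] failed: ", ($! ? $! : 'exit status ' . ($? >> 8)), "\n";
    return $out;
}

# Usage, assuming lynx is on the PATH:
# my $text = capture('lynx', '-dump', $url);
```

Because each argument is passed to the program as-is, nothing in $url is interpreted by a shell.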
Re: making like `lynx -dump` by Anonymous Monk on Mar 14, 2000 at 01:47 UTC
Well, you can do this (assuming that you've slurped the entire page into $page): `$page =~ s{< \s* (?:BR|P) \b [^>]* >}{\n}gix; $page =~ s{< [^>]* >}{}gx;` Then again, as Vroom said, if you have lynx on the system, it's much easier to just use the backquotes.
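Here is the same idea as a small runnable sketch. Each tag pattern uses `[^>]*` so that a match stops at the first `>` instead of running on to the last one in the page, and the replacement is a literal `\n` because `$/` is often undefined while slurping. The `strip_tags` name and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: crude regex-based tag stripping.
sub strip_tags {
    my ($page) = @_;
    # <br> and <p> start tags (with any attributes) become newlines.
    $page =~ s{< \s* (?:br|p) \b [^>]* >}{\n}gix;
    # Every remaining tag is dropped.
    $page =~ s{< [^>]* >}{}gx;
    return $page;
}

my $page = '<P align="left">one<BR>two <a href="x.html">three</a></P>';
print strip_tags($page), "\n";
```

This is still only an approximation: comments, a `>` inside an attribute value, script contents, and entities such as `&amp;` are not handled, which is where a real parser earns its keep.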
Re: making like `lynx -dump` by btrott (Parson) on Mar 14, 2000 at 04:48 UTC
Take a look at the program that runs news.perl.org: the mailing list message is created by some Perl code that produces output quite similar to lynx -dump. Specifically, look towards the bottom of the file for the HTML::FormatText::AddRefs package, then look at the get_mail_text routine.