http://qs1969.pair.com?node_id=244264


in reply to How to save a web page directly to plain text?

There is! One very easy way is to use the lynx text-mode browser to retrieve and save the page, using its -dump option. It not only prints just the text, but also attempts to format it (somewhat crudely) according to the HTML markup. lynx is available for most platforms, but if you don't already have it installed, you might not consider this option "easy".
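If you want to drive lynx from a Perl script rather than from the shell, something like the following might do. It is only a sketch: it assumes lynx is on your PATH, and lynx_dump_args and lynx_dump are helper names I made up for illustration. It also uses a list-form pipe open, which works on Unix-like systems but may not be available in older Perls on Windows.

```perl
use strict;
use warnings;

# Build the command line for a plain-text dump.
# (lynx_dump_args is a hypothetical helper name, not part of any module.)
sub lynx_dump_args {
    my ($url) = @_;
    return ('lynx', '-dump', $url);
}

# Run lynx and return the rendered text of the page.
# Assumes lynx is installed and on the PATH.
sub lynx_dump {
    my ($url) = @_;
    my @cmd = lynx_dump_args($url);
    # List-form pipe open bypasses the shell, so the URL needs no quoting.
    open my $fh, '-|', @cmd
        or die "Cannot run lynx: $!\n";
    local $/;                      # slurp the whole output at once
    my $text = <$fh>;
    close $fh
        or die "lynx exited with an error for $url\n";
    return $text;
}

# Usage: perl script.pl URL > page.txt
print lynx_dump( $ARGV[0] ) if @ARGV;
```

From the shell, the equivalent is simply lynx -dump URL > page.txt.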

Another way is to use LWP (or LWP::Simple) to retrieve the file, and one of the HTML parsing modules (such as HTML::TreeBuilder) to parse the text out of it. For example:
    my $URL = shift or die "Usage: $0 URL\n";
    use LWP::Simple;
    use HTML::TreeBuilder;
    print HTML::TreeBuilder
        ->new_from_content( get( $URL ) or die "Error getting $URL\n" )
        ->as_trimmed_text;
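For anything longer than a throwaway script, you may want a more informative error message and explicit cleanup of the parse tree. Here is one possible variation using LWP::UserAgent instead of LWP::Simple. Treat it as a sketch: html_to_text and fetch_text are names I invented, and the 30-second timeout and UTF-8 output are arbitrary assumptions.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::TreeBuilder;

# Convert an HTML string to whitespace-trimmed plain text.
# (html_to_text is a hypothetical helper name.)
sub html_to_text {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $text = $tree->as_trimmed_text;
    $tree->delete;                 # release the tree explicitly
    return $text;
}

# Fetch a URL and return its text, dying with the HTTP status on failure.
# (fetch_text is a hypothetical helper name; the timeout is an assumption.)
sub fetch_text {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new( timeout => 30 );
    my $res = $ua->get($url);
    die "Error getting $url: " . $res->status_line . "\n"
        unless $res->is_success;
    return html_to_text( $res->decoded_content );
}

# Usage: perl script.pl URL > page.txt
if (@ARGV) {
    binmode STDOUT, ':encoding(UTF-8)';   # assumes UTF-8 output is wanted
    print fetch_text( $ARGV[0] ), "\n";
}
```

Keeping the HTML-to-text step in its own sub means you can also feed it markup from a local file, without touching the network.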

jdporter
The 6th Rule of Perl Club is -- There is no Rule #6.