No need to involve a browser at all. Here's one way, using the excellent HTML::TokeParser::Simple by the monastery's own Ovid.
#!/usr/bin/perl use warnings; use strict; use LWP::Simple; use HTML::TokeParser::Simple; my $page=get('http://www.page.you.want.com/some/path'); my $p = HTML::TokeParser::Simple->new( \$page ); while ( my $token = $p->get_token ) { # This prints all text in an HTML doc (i.e., it strips the HTML) next unless $token->is_text; print $token->as_is; }
Stuffing it into a file is left as an exercise for the poster.
-Any sufficiently advanced technology is
indistinguishable from doubletalk.
In reply to Re: save a page as text
by Hero Zzyzzx
in thread save a page as text
by Anonymous Monk
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |