http://qs1969.pair.com?node_id=491115

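Inspired by "Re: Extracting paragraphs from html", here's a bit of XML::LibXML code that fetches a web page and dumps out every text node longer than 100 characters, which is a rough way of picking out the large paragraphs. STDOUT and STDERR are pointed at /dev/null while parsing so the parser's complaints about bad markup stay out of the output.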
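use XML::LibXML;

my $parser = XML::LibXML->new;
$parser->recover(1);    # keep going on malformed HTML

my $doc = do {
    # Silence parser noise while the page is fetched and parsed.
    local *STDOUT;
    local *STDERR;
    open STDOUT, '>', '/dev/null';
    open STDERR, '>', '/dev/null';
    $parser->parse_html_file("http://www.example.com/some/url");
};

# Print each text node longer than 100 characters.
for my $node ($doc->findnodes(q{//text()[string-length() > 100]})) {
    print $node->toString;
}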
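
For trying the idea without a network connection, here is a minimal sketch of a variant that parses HTML from a string and selects whole <p> elements rather than bare text nodes. The helper name large_paragraphs, its $min_len parameter, and the sample HTML are all made up for illustration and are not part of the original snippet. It also assumes an XML::LibXML recent enough to provide load_html with the suppress_errors and suppress_warnings parser options, which stand in for the /dev/null redirection.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not from the original post): return the text of every
# <p> element whose whitespace-normalized text is longer than $min_len.
sub large_paragraphs {
    my ($html, $min_len) = @_;

    # recover => 1 tolerates broken markup; the suppress_* options quiet the
    # parser without redirecting STDOUT/STDERR.
    my $doc = XML::LibXML->load_html(
        string            => $html,
        recover           => 1,
        suppress_errors   => 1,
        suppress_warnings => 1,
    );

    # $min_len is interpolated into the XPath, so force it to an integer.
    my $xpath = sprintf '//p[string-length(normalize-space(.)) > %d]', $min_len;
    return map { $_->textContent } $doc->findnodes($xpath);
}

# Made-up sample input, for illustration only.
my $html = <<'HTML';
<html><body>
<p>Short.</p>
<p>This paragraph is comfortably longer than the twenty-character threshold.</p>
</body></html>
HTML

print "$_\n" for large_paragraphs($html, 20);
```

Matching on p elements means text split across inline tags such as <b> or <a> is measured as one paragraph, whereas //text() measures each fragment separately. Which of the two works better probably depends on how the target page is marked up.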