Fellow Monasterians:
I am learning to use HTML::Parser to simply strip all the tags from a page and return the plain text. I have found examples of more complicated operations, but nothing quite this simple (yesterday's node got me started). Maybe it is my lack of understanding of how modules are used, or there are particulars of H::P that are eluding me.
The following returns bless( { '_hparser_xs_state' => \138993616 }, 'HTML::Parser' ), which looks like a dereferencing issue, but I am not sure. Ideas? Thanks!

    #!/usr/bin/perl -w
    use warnings;
    use CGI::Carp qw(fatalsToBrowser);
    use HTML::Parser;
    use Data::Dumper;

    my $p = HTML::Parser->new(api_version => 3);
    my $text = $p->parse_file("../pages/about.html") || die print "$!";

    print "Content-type: text/html\n\n";
    print Dumper ($text);
    print $text."\n";
Update: added CPAN tag
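
For what it is worth, the Dumper output is the parser object itself: parse_file() returns the parser, not the page text, so nothing needs dereferencing. HTML::Parser is event driven, and the text only arrives through a handler. A minimal sketch of one way this might be done follows, assuming a text handler that appends to a string. The helper name strip_tags and the sample HTML are made up for illustration and are not from the post above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not from the original post): returns the text of an
# HTML string with the tags removed.
sub strip_tags {
    my ($html) = @_;
    my $text = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        # Called for each run of text between tags; 'dtext' passes the
        # text with HTML entities already decoded.
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Drop the contents of <script> and <style>, not just their tags.
    $p->ignore_elements(qw(script style));

    $p->parse($html);
    $p->eof;    # signal end of input so any buffered text is flushed
    return $text;
}

print strip_tags('<p>Hello, <b>world</b> &amp; friends</p>'), "\n";
```

To read from a file instead, the same parser could be driven with $p->parse_file($path) in place of parse() and eof(); it returns undef and sets $! if the file cannot be opened. Whitespace and line breaks from the original markup are kept as they are, so some tidying may still be wanted afterwards.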
In reply to Using HTML::Parser for simple tag removal by bradcathey