Newbie here. I'm trying to get the data from just the <title> tag of an HTML page.
I have some Perl code (cobbled together from some online examples) that can read the data from an HTML file, and I have an example code snippet that is supposed to read just the <title> tag.
My problem is figuring out how to make the two pieces of code work together. Or maybe I'm going down the wrong path. Any advice would be appreciated.
Here's the code to read in all the data from the HTML file:
    #!/usr/bin/perl -w
    use strict;

    package Example;
    require HTML::Parser;
    @Example::ISA = qw(HTML::Parser);

    my $parser = Example->new;
    $parser->parse_file('index2.html');
    print $parser->{TEXT};

    sub text {
        my ($self, $text) = @_;
        $self->{TEXT} .= $text;
    }
And here's the code snippet, listed on the CPAN page for HTML::Parser, for extracting just the <title> tag data:
    use HTML::Parser ();

    # Called for every start tag; ignores everything except <title>.
    sub start_handler {
        return if shift ne "title";
        my $self = shift;
        # Once inside <title>, print the text and stop parsing at </title>.
        $self->handler(text => sub { print shift }, "dtext");
        $self->handler(end  => sub { shift->eof if shift eq "title"; },
                       "tagname,self");
    }

    my $p = HTML::Parser->new(api_version => 3);
    $p->handler(start => \&start_handler, "tagname,self");
    $p->parse_file(shift || die) || die $!;
    print "\n";
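One possible way to combine the two pieces is to skip the subclass and collect the title text into a variable using the version 3 handler interface. The following is only a sketch: the function name extract_title and the $in_title flag are made up for illustration, and it parses a string rather than a file so that it runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: return the text inside the first <title> element
# of an HTML string, or an empty string if there is no <title>.
sub extract_title {
    my ($html) = @_;
    my $title    = '';
    my $in_title = 0;

    my $p = HTML::Parser->new(
        api_version => 3,
        # Turn collection on when <title> opens.
        start_h => [ sub { $in_title = 1 if $_[0] eq 'title' }, 'tagname' ],
        # Text may arrive in several chunks, so append rather than assign.
        text_h  => [ sub { $title .= $_[0] if $in_title }, 'dtext' ],
        # Stop parsing as soon as </title> is seen.
        end_h   => [ sub {
                         my ($tag, $self) = @_;
                         if ($tag eq 'title') {
                             $in_title = 0;
                             $self->eof;
                         }
                     }, 'tagname,self' ],
    );

    # parse() returns false when a handler ended the parse early via eof;
    # otherwise signal end of document so buffered text is flushed.
    $p->eof if $p->parse($html);
    return $title;
}

print extract_title('<html><head><title>My Page</title></head><body>Hi</body></html>'), "\n";
```

To read from a file as in the first script, replacing the parse/eof line with $p->parse_file('index2.html') should work the same way, since parse_file handles end of input itself.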
In reply to read HTML <title> tag by AngusScrimm