Re: More efficient use of HTML::TokeParser::Simple

Re^2: More efficient use of HTML::TokeParser::Simple by henka (Novice) on Jul 11, 2006 at 06:17 UTC
I poked around HTML::TreeBuilder, but my goodness, things are complicated. It may not seem like it to seasoned monks, but to a C programmer, the OO aspects and data structures of Perl are, well, daunting. Gleaning how to do something as simple as the task I posted here from the Perl module docs is almost always an exercise in frustration.
Re^3: More efficient use of HTML::TokeParser::Simple by GrandFather (Saint) on Jul 11, 2006 at 08:47 UTC
Here's a trivial example that seems to do something like what you want and may be enough to get you started with TreeBuilder:

    use warnings;
    use strict;
    use HTML::TreeBuilder;

    my $html = do {local $/; <DATA>};
    my $tree = HTML::TreeBuilder->new ();

    $tree->parse ($html);
    $tree->eof ();
    $tree->elementify();

    my ($title) = $tree->find ('title');
    my @h1 = $tree->find ('h1');

    print $title->as_text (), "\n";
    print $_->as_text (), "\n" for @h1;

    __DATA__
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <!-- Took this out for IE6ites "http://www.w3.org/TR/REC-html40/loose.dtd" -->
    <html lang="en">
    <head>
    <title>More efficient use of HTML::TokeParser::Simple perlquestion id:560199</title>
    </head>
    <body>
    <h1>Header 1</h1>
    <p>First paragraph</p>
    <h1>Header 2</h1>
    <p>Second paragraph</p>
    <h2>Level 2 header 1</h2>
    </body>
    </html>

Prints:

    More efficient use of HTML::TokeParser::Simple perlquestion id:560199
    Header 1
    Header 2

DWIM is Perl's answer to Gödel
Re^4: More efficient use of HTML::TokeParser::Simple by wfsp (Abbot) on Oct 30, 2008 at 16:07 UTC
What does `$tree->elementify();` do here? It appears to run OK if it is commented out. I've often seen it in snippets and have no idea what purpose it serves.
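One way to poke at this is to print the class of the tree object before and after the call. As far as the HTML::TreeBuilder documentation describes it, `elementify` reblesses the root object from HTML::TreeBuilder into the ordinary element class (normally HTML::Element) and drops the parser's private state, which would explain why a small script behaves the same without it. The following is only a minimal sketch under that reading; the inline HTML string and the `$before`/`$after` variables are made up for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Build a tiny tree from an inline document (illustrative input only).
my $tree = HTML::TreeBuilder->new;
$tree->parse('<html><head><title>Demo</title></head><body><h1>Header 1</h1></body></html>');
$tree->eof;

# Class of the root object while it is still a parser.
my $before = ref $tree;
print "before: $before\n";    # expected: HTML::TreeBuilder

# Rebless the root into the plain element class.
$tree->elementify;

my $after = ref $tree;
print "after:  $after\n";     # expected: HTML::Element

# Element methods such as find and as_text work either way.
my ($title) = $tree->find('title');
print $title->as_text, "\n";

# Release the tree once we are done with it.
$tree->delete;
```

If that reading is right, the call matters mostly when you want to be sure nothing later treats the tree as a live parser; for a short extraction script like the one above, leaving it out should make no visible difference.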
Re^5: More efficient use of HTML::TokeParser::Simple by GrandFather (Saint) on Oct 30, 2008 at 19:52 UTC