monsterzero has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to use HTML::TreeBuilder to parse some HTML data I have retrieved from the web. Specifically, I would like to extract the sunrise/sunset data from the web page. Below is what I have tried. The data I am looking for is everything between the pre tags. However, I am afraid I do not understand what is being displayed when I print the result of the all_attr method :(
Can anyone shed some light on this?
Thanks
Ron Hill
use strict;
use warnings;
use HTML::TreeBuilder;

my $data = do { local $/; <DATA> };
my $tree = HTML::TreeBuilder->new_from_content($data);
print $tree->all_attr();

__DATA__
<html>
<head><title>Sun and Moon Data for One Day</title></head>
<body>
<br>
<h4>U.S. Naval Observatory<br>Astronomical Applications Department</h4>
<br>
<h3>Sun and Moon Data for One Day</h3>
<p>The following information is provided for Adelaide Australia (longitude E138.6, latitude S34.9): </p>
<pre>
Saturday 21 June 2003 Universal Time + 9h
<strong>SUN</strong>
Begin civil twilight 06:25
Sunrise 06:53
Sun transit 11:47
Sunset 16:41
End civil twilight 17:10
<strong>MOON</strong>
Moonrise 22:45 on preceding day
Moon transit 05:24
Moonset 11:53
Moonrise 23:43
Moonset 12:19 on following day
</pre>
<p>Last quarter Moon on 21 June 2003 at 23:45 (Universal Time + 9h). </p>
<br>
<br>
<br>
</body>
</html>
Edit by tye, replace PRE with P tags
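For context on the output: all_attr returns the element's attributes as a flat list of name/value pairs, including HTML::Element's internal ones (names starting with an underscore, such as _tag and _content), so printing it shows those internals rather than the page text. One possible way to get at the pre text instead is sketched below. It is only an illustration, not a definitive solution: it uses a shortened, made-up copy of the page, assumes the times always look like HH:MM after the label, and the %time hash and $label variable are names invented for the example.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Shortened stand-in for the fetched page (assumption: same layout as the real one).
my $html = <<'HTML';
<html><body>
<pre>
<strong>SUN</strong>
Sunrise 06:53
Sun transit 11:47
Sunset 16:41
<strong>MOON</strong>
Moonrise 22:45 on preceding day
</pre>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# In scalar context, look_down returns the first element matching the criteria.
my $pre = $tree->look_down( _tag => 'pre' );
die "no <pre> element found\n" unless $pre;

# as_text gives the text of the element and its descendants, without the tags.
my $text = $pre->as_text;

# Pull out the HH:MM value that follows each label.
my %time;
for my $label (qw(Sunrise Sunset)) {
    ( $time{$label} ) = $text =~ /\b$label\s+(\d\d:\d\d)/;
}

print "$_: $time{$_}\n" for qw(Sunrise Sunset);

# Free the tree explicitly once done with it.
$tree->delete;
```

The \b in the pattern keeps "Sunrise" from matching inside "Moonrise"; if the page layout changes, the regex is the part most likely to need adjusting.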
Replies are listed 'Best First'.
Re: Parsing HTML using TreeBuilder
by Art_XIV (Hermit) on Oct 28, 2003 at 20:27 UTC