phantom85 has asked for the wisdom of the Perl Monks concerning the following question:
Hello, I am using WWW::Mechanize to submit a form and retrieve a class schedule, and I have extracted the part of the HTML I am interested in. My problem is that I now want to put the data into a hash, and I don't know where to start. Below is the code I have so far; it prints the extracted HTML to a file just to check that it returns the correct data.
    use WWW::Mechanize qw();
    use IO::Socket::SSL qw();
    use HTML::TreeBuilder;
    use 5.10.0;
    use strict;
    use warnings;

    # Certificate verification is disabled here; only do this for testing.
    my $mech = WWW::Mechanize->new(
        ssl_opts => {
            SSL_verify_mode => IO::Socket::SSL::SSL_VERIFY_NONE,
            verify_hostname => 0,
        },
    );

    my $url = "scheduleurl";
    $mech->get($url);
    my $filename = 'out.htm';

    # submit_form() fills in the fields and submits the form in one step,
    # so no separate $mech->submit() call is needed afterwards.
    my $result = $mech->submit_form(
        form_number => 2,
        fields      => {
            "ctl00\$ContentPlaceHolder1\$TermDDL"      => 2171,
            "ctl00\$ContentPlaceHolder1\$ClassSubject" => 'CS',
        },
        button => "ctl00\$ContentPlaceHolder1\$SearchButton",
    );
    #print $result->content();

    open(my $fhandle, '>', $filename)
        or die "Could not open file '$filename' $!";

    my $tree = HTML::TreeBuilder->new_from_content($result->content);
    if (my $div = $tree->look_down(_tag => "div", id => "class_list")) {
        #print $div->as_text(), "\n";
        # say $fhandle $div->as_HTML(),"\n";

        # look_down() takes attribute => value criteria; find() expects bare tag names.
        my @list = $div->look_down(_tag => 'ol');
        #print Dumper \@list;
        foreach (@list) {
            say $fhandle $_->as_HTML();
        }
    }
    close $fhandle;
    $tree->delete();
The script prints the following to the file. I have pasted just one item from the list, but there are multiple items in the same format.
    <ol>
      <li><span class="ClassTitle"><strong>CS 128</strong></span> Section 01
        <table border="0" cellpadding="5" cellspacing="0" class="GridView" id="ClassDetails_TBL" width="99%">
          <tr>
            <th align="right" id="TableHeaderCell8" nowrap>Class Nbr</th>
            <td id="TableCell13">11647</td>
            <th align="right" id="TableHeaderCell9" nowrap>Capacity</th>
            <td id="TableCell14">30</td>
          </tr>
          <tr>
            <th align="right" id="TableCell5" nowrap>Title</th>
            <td class="tablealtstyle" id="TableCell8">Introduction to C++</td>
            <th align="right" id="TableCell8a" nowrap>Units</th>
            <td class="tablealtstyle" id="TableCell9">4</td>
          </tr>
          <tr>
            <th align="right" id="TableCell11" nowrap>Time</th>
            <td id="TableCell1">3:00 PM–4:50 PM TuTh</td>
            <th align="right" id="TableCell15">Building/Room</th>
            <td id="TableCell2">8 52</td>
          </tr>
        </table>
      </li>
      <li></li>
      ...
I want to put that data into a hash whose keys are the class titles and whose values are the class information:
    {
        'CS128 Section 01' => {
            'Class Nbr' => 11647,
            Capacity    => 30,
            Title       => 'Introduction to C++',
            Units       => 4,
            Time        => '3:00PM- 4:50PM TuTh',
            Room        => '8 52',
        },
    }
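One possible way to build a hash of that shape is sketched below. It is only an illustration under a few assumptions: every list item looks like the sample above (a `ClassTitle` span, a section label as loose text, and a table whose `th` cells are each followed by their `td` value), the helper names `trim` and `parse_classes` are made up for this example, and the embedded sample is a trimmed copy of the markup with an ASCII hyphen in the time. The keys come out exactly as the page spells them (`CS 128 Section 01`, `Building/Room`), so they would need renaming to match the layout above.

```perl
use 5.010;
use strict;
use warnings;
use HTML::TreeBuilder;
use Data::Dumper;

# Trimmed sample of the markup from the question (one real item, one empty item).
my $html = <<'HTML';
<div id="class_list">
<ol>
<li><span class="ClassTitle"><strong>CS 128</strong></span> Section 01
<table class="GridView" id="ClassDetails_TBL">
<tr><th>Class Nbr</th><td>11647</td><th>Capacity</th><td>30</td></tr>
<tr><th>Title</th><td>Introduction to C++</td><th>Units</th><td>4</td></tr>
<tr><th>Time</th><td>3:00 PM-4:50 PM TuTh</td><th>Building/Room</th><td>8 52</td></tr>
</table>
</li>
<li></li>
</ol>
</div>
HTML

# Hypothetical helper: strip leading/trailing whitespace and collapse inner runs.
sub trim {
    my $s = shift // '';
    $s =~ s/^\s+|\s+$//g;
    $s =~ s/\s+/ /g;
    return $s;
}

# Hypothetical helper: returns { "<title> <section>" => { header => value, ... } }.
sub parse_classes {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);
    my %classes;

    for my $li ($tree->look_down(_tag => 'li')) {
        # Skip empty list items that carry no class data.
        my $title = $li->look_down(_tag => 'span', class => 'ClassTitle') or next;
        my $table = $li->look_down(_tag => 'table') or next;

        # Plain-text children of <li> hold the section label ("Section 01").
        my $section = join ' ', grep { !ref } $li->content_list;
        my $key = trim($title->as_text . ' ' . $section);

        my %info;
        for my $th ($table->look_down(_tag => 'th')) {
            # The value is assumed to be the first element to the right of the header.
            my ($td) = grep { ref } $th->right;
            next unless $td && $td->tag eq 'td';
            $info{ trim($th->as_text) } = trim($td->as_text);
        }
        $classes{$key} = \%info;
    }

    $tree->delete;
    return \%classes;
}

my $classes = parse_classes($html);
$Data::Dumper::Sortkeys = 1;
print Dumper($classes);
```

In the original script, `parse_classes($result->content)` could stand in for the file-writing loop. If two sections of a course ever produced the same key, the later one would overwrite the earlier, so keying on `Class Nbr` instead may be safer.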
Replies are listed 'Best First'.

Re: html to hash table
by Corion (Patriarch) on Oct 30, 2016 at 06:51 UTC

Re: html to hash table
by duyet (Friar) on Oct 30, 2016 at 09:29 UTC
by Anonymous Monk on Oct 30, 2016 at 22:59 UTC
by duyet (Friar) on Oct 31, 2016 at 11:38 UTC

Re: html to hash table
by perl-diddler (Chaplain) on Oct 30, 2016 at 22:41 UTC