uni_j has asked for the wisdom of the Perl Monks concerning the following question:

Hey guys, my code (a web scraper) is designed to grab information from a website. Everything is good, but my main problem has to do with my last hash ("%dep2"). It is suppose to have a lot of data because in my "%dep" hash loop(without the '2'). Also, if anyone wants to show me how to keep reusing the same $stream, I'm sure that would clean up my code a bit ;)

    use WWW::Mechanize;
    use HTML::TokeParser;

    my $c_url = 'https://registrar.utm.utoronto.ca/student/timetable/new_tt_calprev.php?course='.$pre;
    my $url = 'https://registrar.utm.utoronto.ca/student/timetable/index.php';
    my $mech = WWW::Mechanize->new();
    my $stream = HTML::TokeParser->new(\$html);
    my $streamm = HTML::TokeParser->new(\$htmll);
    my $streammm = HTML::TokeParser->new(\$htmlll);
    my %dep;
    my %dep2;
    my @coursename;

    $mech->get($url);
    $html = $mech->content();
    $stream->get_tag("option");
    $term = 20095; # 20095 = Summer
    $year = 1 ;# 1st = 1 , 2nd = 2, 3rd = 3, 4th = 4
    my @c_arr ;# Course array

    while(my $token = $stream->get_token()){
        if($token->[2]{name} eq "dept[]"){
            while(my $token = $stream->get_token("option")){
                if($token->[1] eq "option"){
                    $c = $token->[2]{value};
                    $d = $stream->get_trimmed_text();
                    $dep{ $c } = $d;
                }
            }
        }
    }

    while (($key, $v) = each(%dep)){
        #submit options to list course
        $coursecode = $key;
        $coursecode =~ s/\s//g;
        $mech->get($url);
        $mech->submit_form(
            fields=>{
                session=>$term,
                'yos[]'=>$year,
                'dept[]'=>$key,
            },
        );
        $htmll = $mech->content();
        while(my $tokenn = $streamm->get_token()){
            #grab the prefixes from the listing so we can directly
            if($tokenn->[1] eq "a" && $tokenn->[2]{name}){
                push(@coursename,$tokenn->[2]{name});
                $c = $tokenn->[2]{name};
                $dep2 { $key } = $c;
            }
        } #end while for prefix
    }

    while (($foo, $bar) = each(%dep2)) {
        print $foo;
        print $bar;
    }

Re: Hash loop debugging.
by repellent (Priest) on Apr 24, 2009 at 23:53 UTC
      It is suppose to have a lot of data because in my "%dep" hash loop(without the '2').

    ..... and what is the problem specifically? The sentence needs to be completed.
      I have a loop (for each entry in hash '%dep'). Inside that loop I am adding a value to %dep2. When I print all the values in %dep2, it has only 1 member. This isn't supposed to happen, because %dep has a ton of members in its hash.
        Let's take it from the top:

            my $mech = WWW::Mechanize->new();
            my $stream = HTML::TokeParser->new(\$html);
            my $streamm = HTML::TokeParser->new(\$htmll);
            my $streammm = HTML::TokeParser->new(\$htmlll);
            my %dep;
            my %dep2;
            my @coursename;
            $mech->get($url);
            $html = $mech->content();

        You need to declare $html, $htmll, etc. prior to initializing $stream, $streamm, etc.
        Fix that first, and update the code.
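
A minimal sketch of that ordering, for illustration only: the variable is declared and filled before the parser is built from it. The inline HTML is a hypothetical stand-in for what `$mech->content()` would return (it is not the real timetable page), and the option values are made up.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical stand-in for $mech->content(); values are invented.
my $html = <<'HTML';
<select name="dept[]">
<option value="AAA">Department A</option>
<option value="BBB">Department B</option>
</select>
HTML

my %dep;

# $html is declared and populated before the parser is constructed.
my $stream = HTML::TokeParser->new(\$html)
    or die "could not create parser";

# get_tag returns [$tag, \%attr, \@attrseq, $text] for a start tag.
while (my $tag = $stream->get_tag('option')) {
    my $value = $tag->[1]{value};
    next unless defined $value;
    $dep{$value} = $stream->get_trimmed_text('/option');
}

print "$_ => $dep{$_}\n" for sort keys %dep;
```

In the real script the heredoc would be replaced by the fetched page, with the `$mech->get($url)` call moved above the parser construction.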

        Please use strict; It would have caught undeclared variables.
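
To illustrate the point about strict, here is a small sketch that compiles a snippet at run time and captures the error. The snippet and the variable name `$stream_source` are made up for the demonstration; only the pattern (taking a reference to `$html` before any `my $html`) mirrors the posted code.

```perl
use strict;
use warnings;

# The snippet refers to $html without declaring it first.
my $snippet = <<'PERL';
use strict;
my $stream_source = \$html;
1;
PERL

# eval STRING compiles the snippet; a strict violation ends up in $@.
my $compiled = eval $snippet;
my $error    = $@;

if ($compiled) {
    print "snippet compiled\n";
}
else {
    print "strict rejected the snippet: $error";
}
```

Without strict the same line compiles silently and refers to a package variable, which is why the posted script ran without complaint.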

        Also heed the advice at: How do I post a question effectively?
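
On the original question about reusing `$stream` and `%dep2` ending up with one entry, here is one possible shape. It is a sketch under assumptions, not something confirmed in this thread. It assumes a parser object is read to the end once and then returns nothing more, so a fresh parser is built for each page. It also stores a list per key, since assigning `$dep2{$key} = $c` repeatedly keeps only the last value for that key. The helper name `anchor_names`, the `%page_for` hash, and all sample data are hypothetical.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: names of all <a name="..."> tags in one page.
sub anchor_names {
    my ($html) = @_;

    # A new parser per page, so no parser is read past its end.
    my $parser = HTML::TokeParser->new(\$html)
        or die "could not create parser";

    my @names;
    while (my $tag = $parser->get_tag('a')) {
        my $name = $tag->[1]{name};
        push @names, $name if defined $name && length $name;
    }
    return @names;
}

# Hypothetical stand-ins for the pages returned by $mech->content().
my %page_for = (
    AAA => '<a name="AAA101">one</a> <a name="AAA102">two</a>',
    BBB => '<a name="BBB201">three</a>',
);

my %dep2;
for my $dept (sort keys %page_for) {
    # Keep every name for the department, not just the last one seen.
    $dep2{$dept} = [ anchor_names($page_for{$dept}) ];
}

for my $dept (sort keys %dep2) {
    print "$dept: @{ $dep2{$dept} }\n";
}
```

In the real script, the body of the `%page_for` loop would sit inside the `each(%dep)` loop, with `$mech->content()` passed to the helper after each `submit_form` call.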