Popcorn Dave has asked for the wisdom of the Perl Monks concerning the following question:
I'm working on an application that parses information from a web page, specifically the tables on that page. Everything works fine until I push the data onto my HoA. The data appears correct in the data structure; I've watched it while running under the Tk Debugger. As far as I can tell from what I've read, Perl thinks that some of the data in my anon hash isn't defined. I realize I can "ignore" this by turning off warnings, but I'd rather get to the bottom of the problem before I do that.
As my code shows, I'm setting values to a physical space if there's an HTML space in the table cell.
    sub process_info{
        my ($temp, $year, $info);
        do {($token = $stream->get_token)} until $token->[0] eq "C" and $token->[1] =~ /1958/;
        while($token = $stream->get_token) {
            if ($token->[0] eq "S" and ${$token->[2]}{year}){
                $year = ${$token->[2]}{year}
            }
            if ($token->[0] eq "S"){
                if (${$token->[2]}{id}){
                    $temp = &get_token("td");
                    my @names = split(" ",$temp);
                    $info->{last} = pop(@names);
                    $info->{first} = join( " ", @names );
                    $temp = &get_token("td");
                    if ($temp eq " "){
                        $info->{addr} = '';
                    }
                    else{
                        $info->{addr} = $temp;
                    }
                    $temp = &get_token("br");
                    if ($temp eq " "){
                        ($info->{city}, $info->{state}, $info->{zip}) = ('', '', '');
                    }
                    else{
                        my $t2;
                        ($info->{city}, $t2) = split(', ',$temp);
                        ($info->{state},$t2) = split(" ",$t2);
                        $t2 =~ s/\s+//;
                        $info->{zip} = $t2;
                    }
                    $temp = &get_token("td");
                    if ($temp eq " "){
                        $info->{phone} = '';
                    }
                    else{
                        $info->{phone} = $temp;
                    }
                    $temp = &get_token("a");
                    if ($temp eq " "){
                        $info->{email} = '';
                    }
                    else{
                        $info->{email} = $temp;
                    }
                    push( @{$bros->{$year}}, $info);
                    $info = {}; # reset $info hash
                } # end ${$token->[2]}{id}
            } # end $token->[0] eq "S"
        } # end while
    } # end sub
And a sample of the HTML I'm parsing:
    <!-- Start 1958 -->
    <TR id="start">
      <TD>Rusty Bartel</TD>
      <TD> <BR> </TD>
      <TD> </TD>
      <TD><A HREF=""> </A></TD>
    </TR>
    <TR id="start">
      <TD>Charles Brown</TD>
      <TD>123 Main Ave<BR>Sebastopol, CA 95472</TD>
      <TD> </TD>
      <TD><A HREF=""> </A></TD>
    </TR>
    <TR id="start">
      <TD>Ardon Milkes</TD>
      <TD>345 George Dr<BR>Springfield, VA 22152</TD>
      <TD(503) 555-1212</TD>
      <TD><A HREF="mailto:me@nobody.com">me@nobody.com</A></TD>
    </TR>
I've even tried putting in a '?' so I wouldn't have a physical space if that was what was throwing the warning, but that didn't work either. As far as I can tell, all the data in the anon hash is filled before I push it.
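The belief that every field is filled before the push could be checked directly. The following is only a minimal sketch under stated assumptions: `undefined_fields` is a hypothetical helper that is not part of the code above, and the sample record is made up to mirror the keys that `process_info` sets. It relies only on the fact that an empty string counts as defined in Perl, while `undef` does not.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original code): returns the names of the
# expected keys whose values are missing or undef in a record.
sub undefined_fields {
    my ($record, @expected) = @_;
    return grep { !defined $record->{$_} } @expected;
}

# The keys that process_info assigns for each record.
my @expected = qw(first last addr city state zip phone email);

# Made-up record for illustration: zip is undef, phone and email are
# empty strings (which are still defined).
my $info = {
    first => 'Charles',
    last  => 'Brown',
    addr  => '123 Main Ave',
    city  => 'Sebastopol',
    state => 'CA',
    zip   => undef,
    phone => '',
    email => '',
};

# Report any undefined fields; this is the check one might place
# just before the push onto the HoA.
my @missing = undefined_fields($info, @expected);
warn "undefined fields: @missing\n" if @missing;
```

Placed just before `push( @{$bros->{$year}}, $info);`, a check like this would name the keys that are still undefined at that point, if any, rather than leaving it to the warning to hint at them.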
Can anybody shed any light on this?
TIA!