    use strict;
    use warnings;
    use diagnostics;
    use HTML::TokeParser;
    use LWP::Simple;

    print &parse_diaryland('username', 'password');

    sub parse_diaryland
    {
        my($username, $password);
        my($diary_url);
        my($parsee_man);
        my($html, $tag);
        my($body);

        $username = shift();
        $password = shift();

        $diary_url = "http://$username:$password\@$username.diaryland.com";
        # urls are in the form http://username:password@username.diaryland.com

        $html = get($diary_url);
        print $html;
        $parsee_man = HTML::TokeParser->new(\$html);

        foreach $tag ($parsee_man->get_tag('TD'))
        {
            if ($tag->[1]{'align'} eq 'left' && $tag->[1]{'vAlign'} eq 'top') # fails here
            {
                my($secondtag);
                $secondtag = $parsee_man->get_tag();

                if ($secondtag->[0] eq 'FONT')
                {
                    $body = get_text('/FONT');
                    last(); # got the text body so quit loop
                }
            }
        }
        return($body);
    }

and we're matching this:

    <TD align=left vAlign=top><FONT face="Verdana, Arial, Helvetica, sans-serif" size=2>I
    This is the text body we wanna grab. foo, bar and angst.
    </FONT>

That fails with "use of uninitialised variable in string eq at line 30". I can vaguely make sense of this: the page we're getting has some TD tags earlier on, and they lack the align and valign attributes, so those lookups would be undefined. How can I fix this, though?
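One possible way around the warning, as a minimal sketch rather than a tested fix for the real Diaryland page: default the missing attributes to an empty string before comparing them. While sketching this, a few other things in the code above looked suspect. HTML::TokeParser hands back tag and attribute names in lower case, so 'TD', 'FONT' and 'vAlign' would never match. get_tag returns a single tag per call, so a while loop is needed instead of foreach. And get_text is a method on the parser object. The helper name extract_body and the inline sample HTML are assumptions made up for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper (not from the original post): return the text of the
# first <font> that directly follows a <td align=left valign=top>.
sub extract_body {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Could not create parser";

    # get_tag returns one tag per call, so loop with while, not foreach.
    # Tag and attribute names come back lowercased from the parser.
    while (my $tag = $parser->get_tag('td')) {
        my $attr   = $tag->[1];
        # Default missing attributes to '' so eq never sees undef.
        my $align  = defined $attr->{align}  ? lc $attr->{align}  : '';
        my $valign = defined $attr->{valign} ? lc $attr->{valign} : '';
        next unless $align eq 'left' && $valign eq 'top';

        # Look at the very next tag; it may be missing at end of document.
        my $next = $parser->get_tag();
        next unless defined $next && $next->[0] eq 'font';

        # get_text is a method on the parser, not a plain function.
        return $parser->get_text('/font');
    }
    return;    # no matching cell found
}

# Assumed sample input: one TD without attributes, then the one we want.
my $sample = '<TD>nav</TD>'
           . '<TD align=left vAlign=top><FONT size=2>the text body</FONT></TD>';
my $body = extract_body($sample);
print defined $body ? $body : 'no match', "\n";
```

In parse_diaryland the same loop could replace the foreach block, with $html coming from get($diary_url) as before. It would probably also be worth checking that get() actually returned something before parsing it.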
In reply to Diaryland parsing by Amoe