Re: Please help with a fetching issue

Either of these, tuned to your needs, should do the trick: HTML::TokeParser::Simple or XML::LibXML.

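use strict;
use warnings;
use LWP::Simple qw( get );
use HTML::TokeParser::Simple;

my $uri  = shift || die "Gimme URI!\n";
my $page = get($uri);
die "Couldn't fetch $uri\n" unless defined $page;

# Option 1: walk the token stream until the first matching <td>.
my $p = HTML::TokeParser::Simple->new(\$page);

while ( my $token = $p->get_tag("td") )
{
    # Not every <td> has a class attribute, so guard against undef.
    my $class = $token->get_attr("class");
    next unless defined $class and $class =~ /\bsomeClass\b/;
    my $first_child = $p->get_token();
    print $first_child->as_is, $/;
    last;
}

# Option 2: parse into a DOM and query it with XPath.
use XML::LibXML;

my $parser = XML::LibXML->new;
$parser->recover_silently(1);
my $doc = $parser->parse_html_string($page);
my ( $td ) = $doc->findnodes('//td[@class="someClass"]');
print $td->textContent, $/ if $td;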

(update: rolled into single <code/> and fixed tag name.)


Replies are listed 'Best First'.

Re^2: Please help with a fetching issue
by mahira (Acolyte) on Mar 25, 2010 at 06:50 UTC

Thank you very much.

I was not able to use the solutions above. I don't know why, but I think it is something related to the page...

In the end, I was able to fix the issue with some regex. But this time I used a tag right before the table division. After fetching the page:

$page =~ s/(\n|\r)/<!--xxx-->/g;
$page =~ s/.*<!--start\stag-->(.*)<!--end\stag-->.*/$1/;
$page =~ s/<!--xxx-->/\n/g;
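
For what it's worth, the placeholder round trip for newlines can probably be avoided with the /s modifier, which lets "." match newlines too. The following is only a minimal sketch under assumptions: extract_between is a hypothetical helper name, the sample page is made up, and the markers are matched literally with a plain space rather than \s. Unlike the greedy substitution above, it takes the text between the first start marker and the first end marker that follows it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between two literal markers,
# or undef when either marker is missing.
sub extract_between {
    my ( $text, $start, $end ) = @_;
    # /s lets "." match newlines; .*? stops at the first end marker.
    # \Q...\E quotes the markers so they are matched literally.
    return $text =~ /\Q$start\E(.*?)\Q$end\E/s ? $1 : undef;
}

# Made-up sample page, for illustration only.
my $page = "<html>\n<!--start tag-->\n<table>\n<tr><td>42</td></tr>\n</table>\n<!--end tag-->\n</html>\n";

my $fragment = extract_between( $page, '<!--start tag-->', '<!--end tag-->' );
print defined $fragment ? $fragment : "markers not found\n";
```

Returning undef when a marker is missing makes it possible to tell "no match" apart from an empty fragment, whereas a failed s/// silently leaves the whole page in place.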

Thanks again for your help.
