Re: parsing html link

You were nearly there (if I've understood your question correctly).

#!/usr/bin/perl

use warnings;
use strict;
use LWP::Simple;
use HTML::TreeBuilder;

print "ELENCO LIGANDI\n";

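# Keep the URL on one line: a line break inside the string would corrupt it.
my $url3 = "http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2j6p&template=ligands.html&l=1.1";
my $content = get($url3);
# get() returns undef on failure, so stop here rather than parse nothing.
die "Could not fetch $url3\n" unless defined $content;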

my $p = HTML::TreeBuilder->new;
$p->parse_content($content);

my @anchors = $p->look_down(_tag => q{a});
for my $anchor (@anchors){
  my $txt = $anchor->as_text;  
  if ($txt=~ /EPE\s/){
    print $txt, qq{\n};
    my $href = $anchor->attr(q{href});
    print $href, qq{\n};
  }
}

$p->delete;

__DATA__
output:
ELENCO LIGANDI
EPE 1148(C)
/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2j6p&template=ligands.html&l=4.1
EPE 1148(D)
/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2j6p&template=ligands.html&l=4.2

update: changed var names to anchor/s
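
The hrefs printed above are relative to the site root, so they need to be turned into absolute URLs before they can be passed to get(). A minimal sketch of one way to do that, assuming the URI module is available (it is a dependency of LWP, so it is normally already installed). The helper name absolutize is hypothetical and not part of the script above:

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not from the original post): resolve an href
# against the URL of the page it was found on.
sub absolutize {
    my ($href, $base) = @_;
    return URI->new_abs($href, $base)->as_string;
}

# Page URL and one of the hrefs from the output above.
my $base = "http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2j6p&template=ligands.html&l=1.1";
my $href = "/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2j6p&template=ligands.html&l=4.1";

print absolutize($href, $base), "\n";
```

Inside the loop this would amount to calling absolutize($href, $url3) on each matching anchor.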


Replies are listed 'Best First'.

SOLVED: parsing html link
by paola82 (Sexton) on May 25, 2009 at 10:34 UTC

Thanks, you solved my question. Is it enough if I mark the thread as solved and paste your code below? For beginners like me, I suggest reading the previous posts.

#!/usr/bin/perl

use warnings;
use strict;
use LWP::Simple;
use HTML::TreeBuilder;

print "ELENCO LIGANDI\n";

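# Keep the URL on one line: a line break inside the string would corrupt it.
my $url3 = "http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=2j6p&template=ligands.html&l=1.1";
my $content = get($url3);
# get() returns undef on failure, so stop here rather than parse nothing.
die "Could not fetch $url3\n" unless defined $content;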

my $p = HTML::TreeBuilder->new;
$p->parse_content($content);

my @anchors = $p->look_down(_tag => q{a});
for my $anchor (@anchors){
  my $txt = $anchor->as_text;  
  if ($txt=~ /EPE\s/){
    print $txt, qq{\n};
    my $href = $anchor->attr(q{href});
    print $href, qq{\n};
  }
}