in reply to Problem extracting an HTML table with Perl

The problem is your visualization method (I think). Data::Dump outputs all entries in the object, and HTML::Elements are built with _parent keys so you can navigate bidirectionally. If you output with
print $data->as_HTML;
instead, I think you'll get what you expect.
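
To illustrate the point, here is a minimal sketch that avoids the network. The inline HTML string and the variable names are made up for the example; only the class name is taken from the thread. It shows that an element carries a link to its parent (which is why a raw dump pulls in the rest of the tree), while as_HTML serializes just the element and its children.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical inline HTML standing in for the downloaded page.
my $html = '<html><body><div class="genome_descr"><p>Example text</p></div></body></html>';

my $tree = HTML::TreeBuilder->new_from_content($html);
my $data = $tree->look_down( _tag => 'div', class => 'genome_descr' );

# Every element keeps a reference to its parent, so dumping $data
# with Data::Dump walks back up and out into the whole document.
my $parent_tag = $data->parent->tag;
print "parent tag: $parent_tag\n";

# as_HTML only serializes this element and what is below it.
my $out = $data->as_HTML;
print $out, "\n";
```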

#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re^2: Problem extracting an HTML table with Perl
by Sosi (Sexton) on Aug 11, 2014 at 17:05 UTC

    Indeed, it got a bit better, but I am still getting a lot of information. I now found that my search is completely independent of that "class" in my $tree->find. So any of the following alternatives gives the same result, and shows that the search is only done on the tag:

    my $data =$tree->find( '_tag' =>'div' );

    or even

    my $data =$tree->find( '_tag' =>'div', class => 'somethingthatdoesnotexists1209841290r' );
      That is not what I see, and I note the OP used the look_down method instead of the find method as you have in this post.
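
As far as I understand the HTML::Element docs, find is an alias for find_by_tag_name and treats every argument as a tag name, whereas look_down takes attribute/value criteria. A small sketch of the difference, using made-up inline HTML rather than the real page:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup: two divs, only one with the class of interest.
my $html = '<html><body>'
         . '<div class="genome_descr">wanted</div>'
         . '<div class="other">unwanted</div>'
         . '</body></html>';

my $tree = HTML::TreeBuilder->new_from_content($html);

# find() reads all four arguments as tag names, so this matches
# every <div> (and would match <_tag>, <class>, ... if they existed).
my @by_find = $tree->find( '_tag', 'div', 'class', 'no_such_class' );

# look_down() reads the arguments as criteria that must all hold.
my @by_look_down = $tree->look_down( _tag => 'div', class => 'genome_descr' );

printf "find: %d element(s), look_down: %d element(s)\n",
    scalar @by_find, scalar @by_look_down;
```

That would explain why changing the class value made no difference with find.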

      If I run

      #!/usr/local/bin/perl
      use strict;
      use warnings;
      use autodie;
      use Data::Dump;
      use HTML::Tree;
      use LWP::Simple qw(get);

      my $content = get('http://www.ncbi.nlm.nih.gov/genome/?term=Xylella_fastidiosa');
      my $tree = HTML::Tree->new();
      $tree->parse($content);
      my $data = $tree->look_down( '_tag' => 'div', class => 'genome_descr' );
      print $data->as_HTML;
      I get the output
      <div class="genome_descr"><p><b>Submitter: </b><a href="http://aeg.lbi.ic.unicamp.br/xf/" target="_blank">Sao Paulo state (Brazil) Consortium</a></div>
      If I run with
      my @data =$tree->look_down( '_tag' =>'div', class => 'genome_descr' );
      instead, I get 2 results. How does this compare for you?
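
For reference, a sketch of how the list-context results could be printed, one element at a time. The inline HTML here is invented for the example and is not the NCBI page:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup with two matching divs.
my $html = '<html><body>'
         . '<div class="genome_descr">first block</div>'
         . '<div class="genome_descr">second block</div>'
         . '</body></html>';

my $tree = HTML::TreeBuilder->new_from_content($html);

# Scalar context: look_down returns only the first match.
my $first = $tree->look_down( _tag => 'div', class => 'genome_descr' );

# List context: look_down returns every match.
my @all = $tree->look_down( _tag => 'div', class => 'genome_descr' );

print scalar(@all), " match(es)\n";
print $_->as_HTML, "\n" for @all;
```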


        I had missed the difference between find and look_down. I have also realized that I chose the wrong tag, since what I want is the list that appears after this tag, but I'll get to that later.

        Indeed, I now see that I cannot dd $data, as it dumps everything; print $data->as_HTML solves the problem. One more question: in your second example, how did you print the data?

        Thanks so much!