HTML::TreeBuilder::XPath is slow, leaks memory, and is buggy, so I recommend using HTML::TreeBuilder::LibXML instead:
use strict;
use warnings;
use HTML::TreeBuilder::LibXML;
use Data::Dumper;

my $html = <<'HTML';
<html><body>
<table>
<tr><th>Firstseen (UTC)</th><th>Version</th><th>Feodo C&amp;C</th><th>Status</th><th>SBL</th><th>ASN</th><th>Country</th><th>Lastseen (UTC)</th></tr>
<!-- the rows below are the ones being parsed -->
<tr bgcolor="#9d9595" onmouseover="this.style.backgroundColor='#FFA200';" onmouseout="this.style.backgroundColor='#9d9595';"><td>2016-03-19 23:44:36</td><td bgcolor="#58D3F7" align="center"><strong>D</strong></td><td><a href="/host/83.172.215.87/" target="_parent" title="Show more information about this Feodo C&amp;C">83.172.215.87</a></td><td bgcolor="#4f883f">offline</td><td bgcolor="#bc5959"><a href="http://www.spamhaus.org/sbl/sbl.lasso?query=SBL290535" target="_blank" title="Spamhaus SBL: SBL290535">SBL290535</a></td><td>AS12651 IPWORLDCOM</td><td><img src="images/flags/ch.gif" alt="-" title="CH (CH)" width="16" height="10" /> CH</td><td>never</td></tr>
<tr bgcolor="#837b7b" onmouseover="this.style.backgroundColor='#FFA200';" onmouseout="this.style.backgroundColor='#837b7b';"><td>2016-03-19 23:44:36</td><td bgcolor="#58D3F7" align="center"><strong>D</strong></td><td><a href="/host/98.23.159.86/" target="_parent" title="Show more information about this Feodo C&amp;C">98.23.159.86</a></td><td bgcolor="#4f883f">offline</td><td bgcolor="#4f883f">Not listed</td><td>AS7029 WINDSTREAM</td><td><img src="images/flags/us.gif" alt="-" title="US (US)" width="16" height="10" /> US</td><td>never</td></tr>
<tr bgcolor="#9d9595" onmouseover="this.style.backgroundColor='#FFA200';" onmouseout="this.style.backgroundColor='#9d9595';"><td>2016-03-19 23:44:36</td><td bgcolor="#58D3F7" align="center"><strong>D</strong></td><td><a href="/host/178.188.14.86/" target="_parent" title="Show more information about this Feodo C&amp;C">178.188.14.86</a></td><td bgcolor="#4f883f">offline</td><td bgcolor="#4f883f">Not listed</td><td>AS8447 TELEKOM-AT</td><td><img src="images/flags/at.gif" alt="-" title="AT (AT)" width="16" height="10" /> AT</td><td>2016-03-24 01:19:50</td></tr>
</table>
</body></html>
HTML

# Parse the whole document in one go
my $tree = HTML::TreeBuilder::LibXML->new;
$tree->parse($html);
$tree->eof;

# Select only the data rows: <tr> elements that contain <td> cells
my @tr_nodes = $tree->findnodes('//tr[td]');
foreach my $tr_node (@tr_nodes) {
    # findvalues returns one string per matching <td>
    my @text = $tr_node->findvalues('td');
    #my @text = $tr_node->findvalue('td'); # compare with this one! findvalue will concatenate all nodes for you
    print Dumper( \@text );
    # do something with @text...
}
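To make the findvalue/findvalues difference from the comment in the code easier to see, here is a minimal sketch on a made-up two-cell row (the table content is hypothetical, not taken from the real page). It assumes HTML::TreeBuilder::LibXML is installed and only uses the same calls as the code in this post:

```perl
use strict;
use warnings;
use HTML::TreeBuilder::LibXML;

# Hypothetical two-cell row, just to make the difference visible
my $tree = HTML::TreeBuilder::LibXML->new;
$tree->parse('<table><tr><td>offline</td><td>never</td></tr></table>');
$tree->eof;

# Take the first row that has <td> cells
my ($tr) = $tree->findnodes('//tr[td]');

# findvalues: one string per matching <td>
my @cells = $tr->findvalues('td');

# findvalue: a single string built from all matching <td> nodes
my $joined = $tr->findvalue('td');

print scalar(@cells), " cells: @cells\n";
print "joined: $joined\n";
```

So if you want the fields separately, findvalues is the one to use; findvalue is only handy when the XPath matches a single node.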

In reply to Re: Problem getting fields out of an XPath node list by Gangabass
in thread Problem getting fields out of an XPath node list by ejc1
