I am having a slight issue using the HTML::TreeBuilder::XPath module. I have a block of HTML like the one below (the specific data doesn't matter; there are thousands of blocks in the same format):
<div id="filerDiv">
  <div class="mailer">Mailing Address
    <span class="mailerAddress">65 MARKET STREET, SUITE 1207,</span>
    <span class="mailerAddress">CAMANA BAY, P.O. BOX 31110</span>
    <span class="mailerAddress">GRAND CAYMAN E9 KY1-1205</span>
  </div>
  <div class="mailer">Business Address
    <span class="mailerAddress">65 MARKET STREET, SUITE 1207,</span>
    <span class="mailerAddress">CAMANA BAY, P.O. BOX 31110</span>
    <span class="mailerAddress">GRAND CAYMAN E9 KY1-1205</span>
    <span class="mailerAddress">345 943 4573</span>
  </div>
  <div class="companyInfo">
    <span class="companyName">GREENLIGHT CAPITAL RE, LTD. (Filer) <acronym title="Central Index Key">CIK</acronym>: <a href="/cgi-bin/browse-edgar?CIK=0001385613&amp;action=getcompany">0001385613 (see all company filings)</a></span>
    <p class="identInfo"><acronym title="Internal Revenue Service Number">IRS No.</acronym>: <strong>000000000</strong><br />Type: <strong>10-Q</strong> | Act: <strong>34</strong> | File No.: <a href="/cgi-bin/browse-edgar?filenum=001-33493&amp;action=getcompany"><strong>001-33493</strong></a> | Film No.: <strong>161612131</strong><br /><acronym title="Standard Industrial Code">SIC</acronym>: <b><a href="/cgi-bin/browse-edgar?action=getcompany&amp;SIC=6331&amp;owner=include">6331</a></b> Fire, Marine &amp; Casualty Insurance<br />Assistant Director 1</p>
  </div>
</div>
For the second div with the class "mailer", I need to get the text of each span inside it. I have been messing around with it for a while now, but I can only ever get all the text as one line. I would like to store each span individually in an array, so in this instance, where there are 4 spans, the array should have 4 elements. Here is a snippet of the code I am using to parse the file:
my $root = HTML::TreeBuilder::XPath->new;
$root->parse($content);
my @Baddress = $root->findvalue('//div[@id="filerDiv"]/div[@class="mailer"][2]/span/text()');
Any kind of help would be greatly appreciated.
Update

I figured it out. I was using $root->findvalue and not $root->findvalues, so everything was being returned as a single string and my array ended up with only one element. Thanks for reading.
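For anyone who finds this later, here is a minimal sketch of the working version. The trimmed-down $content and the lower-case @baddress name are just for illustration (my real input is the full page), and I added an eof call after parse, which is the usual way to tell the parser the document is complete. I also dropped the trailing /text() and let findvalues return the string value of each span; with spans that contain only text, that should give the same strings.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Trimmed sample of the HTML from the question (illustrative only).
my $content = <<'HTML';
<div id="filerDiv">
  <div class="mailer">Mailing Address
    <span class="mailerAddress">65 MARKET STREET, SUITE 1207,</span>
    <span class="mailerAddress">CAMANA BAY, P.O. BOX 31110</span>
    <span class="mailerAddress">GRAND CAYMAN E9 KY1-1205</span>
  </div>
  <div class="mailer">Business Address
    <span class="mailerAddress">65 MARKET STREET, SUITE 1207,</span>
    <span class="mailerAddress">CAMANA BAY, P.O. BOX 31110</span>
    <span class="mailerAddress">GRAND CAYMAN E9 KY1-1205</span>
    <span class="mailerAddress">345 943 4573</span>
  </div>
</div>
HTML

my $root = HTML::TreeBuilder::XPath->new;
$root->parse($content);
$root->eof;    # signal end of input so the tree is finalized

# findvalues (plural) returns one string per matched node,
# whereas findvalue (singular) returns a single concatenated string.
my @baddress = $root->findvalues(
    '//div[@id="filerDiv"]/div[@class="mailer"][2]/span'
);

# Print each span's text on its own line.
print "$_\n" for @baddress;

$root->delete;    # free the tree when done
```

The only change that matters compared with my original snippet is findvalue versus findvalues; the XPath expression itself was already selecting the right spans.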


In reply to Help with HTML::TreeBuilder::XPath by edimusrex
