PerlMonks  

brainpan emerges from the shadows of the monastery and humbly seeks enlightenment from his elders.

I want to parse the content out of an HTML table using HTML::TableExtract. For most of the data this takes only a few lines, but for some reason I can't make it search for a header when that header consists only of an image (for which I know the URL). I assume the source of the problem is that, because TableExtract is a subclass of HTML::Parser, it doesn't see the image's URL as text that it should be parsing. If I were dealing with HTML::TokeParser, I'd work around this with a line like this:

$tokeparser->{textify} = {img => 'src'};

However, I can't figure out how to do this with HTML::Parser. Am I approaching this the right way? Do I need to 'textify' HTML::Parser objects to make HTML::TableExtract search for the image's URL, or can all of this be done by interfacing only with TableExtract? Is there a better way to extract the data from an HTML table when using an image as an anchor point?
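
For reference, here is a minimal sketch of the HTML::TokeParser workaround mentioned above. It shows only what the textify setting does on the TokeParser side and does not involve HTML::TableExtract at all. The helper name header_image_src and the sample table are made up for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the text of the first <th> cell, with
# <img> tags rendered as their src attribute instead of the alt text.
sub header_image_src {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Could not create parser";

    # The line from the question: textify <img> tags using 'src'.
    $p->{textify} = { img => 'src' };

    # Skip ahead to the first header cell; give up if there is none.
    $p->get_tag('th') or return;

    # Collect the text up to the closing </th>, whitespace-trimmed.
    return $p->get_trimmed_text('/th');
}

# Made-up sample table; the image path is purely illustrative.
my $sample = '<table><tr><th><img src="/images/price.gif"></th></tr>'
           . '<tr><td>42</td></tr></table>';
print header_image_src($sample), "\n";
```

With this setting the header cell yields the image's src value, which can then be compared against the known URL.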

And no, I don't own 27 pairs of sweatpants.

In reply to using the headers method of HTML::TableExtract to find an image by brainpan
