Re: Extracting content from html

First, let's clarify your objective. From what I understand of both your introduction and the subsequent code:

1. Enter query term
2. Attempt to download input file from known server/directory
3. If not 404, parse HTML for desired link
4. Attempt to download parsed link
5. If not 404, parse HTML for desired content
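As a rough sketch of that flow (not your actual code), the steps could be chained so that each stage bails out as soon as a download or a parse fails. Everything named here is an assumption for illustration: the `fetch` and `extract` subs, the use of the core HTTP::Tiny module, and the two callbacks that stand in for whatever parsing you end up using. How the query term maps to the starting URL is left to the caller, since your post doesn't say.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: return the page body, or undef on any failed
# request (a 404 included), so the caller can stop early.
sub fetch {
    my ($url) = @_;
    my $res = HTTP::Tiny->new->get($url);
    return $res->{success} ? $res->{content} : undef;
}

# Hypothetical driver for the whole chain. $find_link and $find_content
# are caller-supplied code refs that take HTML and return a string or
# undef. $fetch is optional and defaults to the helper above.
sub extract {
    my ($start_url, $find_link, $find_content, $fetch) = @_;
    $fetch ||= \&fetch;

    my $index = $fetch->($start_url);      # download the input file
    return unless defined $index;          # 404 or other failure

    my $link = $find_link->($index);       # parse for the desired link
    return unless defined $link;

    my $page = $fetch->($link);            # download the parsed link
    return unless defined $page;

    return $find_content->($page);         # parse for the desired content
}
```

Passing the fetcher in as a code ref is only a convenience: it lets you try the parsing callbacks against saved copies of the pages without touching the server.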

You seem to be able to fetch the HTML page, although your code presumes the response is not a 404. However, to comment properly on why your HTML::TokeParser::Simple code doesn't work, you'll have to elaborate on both the content you're trying to access ("the necessary data from specific actor" is particularly vague) and the HTML that surrounds it.

I'd further assert that while descending the HTML structure will work, it depends on the exact nesting of elements and may break if the site is redesigned. Depending on what you're trying to accomplish, it may be easier to use a regex.
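To illustrate the regex route, here is a minimal sketch that keys on a single marker rather than on the nesting. It assumes the content you want sits in a `td` cell whose only attribute is `class=quote` (quoted or not), which is my guess from your example code. The sub name `quotes_by_regex` is made up for the example.

```perl
use strict;
use warnings;

# Return the contents of every <td class=quote> cell in the given HTML.
# Assumption: class is the only attribute on the cell; extra attributes
# or nested tables would need a real parser instead.
sub quotes_by_regex {
    my ($html) = @_;
    my @quotes = $html =~ m{<td\s+class=["']?quote["']?\s*>(.*?)</td>}gis;
    return @quotes;
}
```

This survives changes to the surrounding layout but not a change to the cell's own markup, so it only trades one kind of fragility for another.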

From what I can make of your example code, you're working with an HTML file of the form:

<html><head></head><body>
<div class=wrapper>
<a href="http://sub.domain.tld/folder/page.html">link text</a>
</div>
</body></html>

which links to a page of the form:

<html><head></head><body>
<table>
<tr><td class=quote>To be or not to be...</td></tr>
</table>
</body></html>
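If the two pages really do look like these samples, a token-based pass with HTML::TokeParser::Simple might look roughly like the following. This is a sketch against my guessed markup, not a fix for your code: the sub names `wrapper_link` and `quote_cells` are invented, the `wrapper` and `quote` class names come from the samples above, and nested `div` elements inside the wrapper are not handled.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# First page: href of the first <a> found inside <div class=wrapper>.
sub wrapper_link {
    my ($html) = @_;
    my $p = HTML::TokeParser::Simple->new(string => $html);
    my $in_wrapper = 0;
    while (my $t = $p->get_token) {
        if ($t->is_start_tag('div') && ($t->get_attr('class') // '') eq 'wrapper') {
            $in_wrapper = 1;
        }
        elsif ($t->is_end_tag('div')) {
            $in_wrapper = 0;    # simplification: assumes no nested divs
        }
        elsif ($in_wrapper && $t->is_start_tag('a')) {
            return $t->get_attr('href');
        }
    }
    return;
}

# Second page: raw text of every <td class=quote> cell.
sub quote_cells {
    my ($html) = @_;
    my $p = HTML::TokeParser::Simple->new(string => $html);
    my ($in_quote, $buf, @quotes) = (0, '');
    while (my $t = $p->get_token) {
        if ($t->is_start_tag('td') && ($t->get_attr('class') // '') eq 'quote') {
            ($in_quote, $buf) = (1, '');
        }
        elsif ($in_quote && $t->is_end_tag('td')) {
            push @quotes, $buf;
            $in_quote = 0;
        }
        elsif ($in_quote && $t->is_text) {
            $buf .= $t->as_is;  # text is kept as-is, entities not decoded
        }
    }
    return @quotes;
}
```

Note that a start tag named `dt` would never match `is_start_tag('td')`, so a one-letter slip in the tag name makes the loop run to the end and return nothing without any error.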

The apparent 'dt' vs 'td' typo aside, I'd still need to know what criteria you're trying to employ to select which actor, which quote, et al.
