How to process each node in an HTML page

nysus has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to analyze an HTML document to find its visually widest elements. I want to visit each node on the page and determine its width using the element_coordinates method. The method takes a CSS selector as an argument, so I'm looking for a way to generate a CSS selector for each node, similar to the way the developer tools in browsers do.

My first question is: can this be done through WWW::Mechanize::Chrome over port 9222? I'm guessing I would need to learn how to send queries directly through the $mech object over the transport layer. If this is possible, any details would be appreciated.

If that won't work, how can I generate a unique CSS selector for each node? My initial thought process was:

Take the HTML content and throw it into a tree.
Recurse over the tree and generate an XML element for each node in the tree.
Use the XML elements to construct a unique XPath for each node (I'm not sure how to do this).
Finally, convert each XPath to a CSS selector. I'm not sure how to do this either.
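
One way to skip the XPath detour in the steps above would be to build the selector directly from each element's position among its siblings, which is similar in spirit to what browser developer tools produce. The following is only a minimal sketch in JavaScript, meant to run inside the page. `cssPath` is a hypothetical name, and the code assumes nothing beyond the standard DOM element properties `tagName`, `parentElement`, and `children`. It has not been tested against WWW::Mechanize::Chrome.

```javascript
// Build a CSS selector that identifies `el` by walking up to the root element.
// Each step records the tag name and the element's 1-based position among
// its parent's element children, so the resulting path matches one node only.
function cssPath(el) {
  const parts = [];
  for (let node = el; node && node.tagName; node = node.parentElement) {
    const tag = node.tagName.toLowerCase();
    const parent = node.parentElement;
    if (!parent) {
      // Reached the root element (normally <html>); no position is needed.
      parts.unshift(tag);
      break;
    }
    // :nth-child counts element siblings of any tag, starting at 1.
    const index = Array.prototype.indexOf.call(parent.children, node) + 1;
    parts.unshift(`${tag}:nth-child(${index})`);
  }
  return parts.join(' > ');
}
```

For a `p` that is the second child of a `body` that is itself the second child of `html`, this yields `html > body:nth-child(2) > p:nth-child(2)`. Text nodes and comments have no selector, so only element nodes can be addressed this way.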

Any help is appreciated. Thanks.

$PM = "Perl Monk's";
$MCF = "Most Clueless ~~Friar~~ ~~Abbot~~ ~~Bishop~~ ~~Pontiff~~ ~~Deacon~~ ~~Curate~~ ~~Priest~~ Vicar";
$nysus = $PM . ' ' . $MCF;
Click here if you love Perl Monks


Replies are listed 'Best First'.
Re: How to processing each node in an HTML page by marto (Cardinal) on Apr 05, 2019 at 08:34 UTC
I'd probably just use WWW::Mechanize::Chrome to run a little bit of JavaScript to find the widest element on a rendered page. Update: example JS code (not mine); note that you'll want to alter the selector so that it matches child elements of the body tag. See also examples/javascript.pl.
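
To make the suggestion above concrete, here is a minimal sketch of the kind of JavaScript that could be run in the rendered page. It is not the linked example code. `widestElement` is a hypothetical name, and the `body *` selector is an assumption about which elements are of interest. The only DOM calls used are the standard `querySelectorAll` and `getBoundingClientRect`.

```javascript
// Return the element with the largest rendered width, or null if the
// collection is empty. Ties keep the first element encountered.
function widestElement(elements) {
  let widest = null;
  let maxWidth = -Infinity;
  for (const el of elements) {
    // getBoundingClientRect reports the element's rendered box in CSS pixels.
    const { width } = el.getBoundingClientRect();
    if (width > maxWidth) {
      maxWidth = width;
      widest = el;
    }
  }
  return widest;
}

// Only attempt the DOM query when running inside a browser page.
if (typeof document !== 'undefined') {
  const el = widestElement(document.querySelectorAll('body *'));
  if (el) {
    console.log(el.tagName, el.getBoundingClientRect().width);
  }
}
```

Because the measurement happens in the browser, no per-node selector has to be generated on the Perl side; the script only needs to hand back whatever identifies the winning element.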
Re^2: How to processing each node in an HTML page by nysus (Parson) on Apr 05, 2019 at 12:53 UTC
Ah, good call. I had forgotten about that feature of WMC. This would be using the `$mech->eval` method.