Re: WWW::Mechanize::Chrome VERY slow on xpath obtaining TDs of a TR

by LanX (Saint)
on Nov 26, 2022 at 13:58 UTC ( [id://11148398] )


in reply to WWW::Mechanize::Chrome VERY slow on xpath obtaining TDs of a TR

I had a quick look into the docs of ->xpath

and found these passages, in which I've emphasized two parts:

    $mech->xpath( $query, %options )

    • my $link = $mech->xpath('//a[id="clickme"]', one => 1);
      # croaks if there is no link or more than one link found
    • my @para = $mech->xpath('//p');
      # Collects all paragraphs
    • my @para_text = $mech->xpath('//p/text()', type => $mech->xpathResult('STRING_TYPE'));
      # Collects all paragraphs as text
    ...
    • node - node relative to which the query is to be executed. Note that you will have to use a relative XPath expression as well. Use

      .//foo

      instead of

      //foo

      Querying relative to a node only works for restricting to children of the node, not for anything else. This is because we need to do the ancestor filtering ourselves instead of having a Chrome API for it.

So, two insights into potential bottlenecks:

  • when you query relative to a node, the module has to do the ancestor filtering itself instead of letting Chrome evaluate one assembled XPath. Putting everything into a single path yourself might be far more efficient (and your identifier is probably not as unambiguous as you thought) - see the sketch after this list
  • you might get an expensive wrapper object for each result, unless you specify a string type so that plain text comes back
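
To make that concrete, here is an untested sketch of the two patterns, using only the ->xpath calls quoted from the docs above. The table path ('//table[1]//tr') is just a placeholder, not your real markup, and the text extraction from the node wrappers is left as a comment because the exact method name should be checked in the node class docs:

    use strict;
    use warnings;
    use WWW::Mechanize::Chrome;

    my $mech = WWW::Mechanize::Chrome->new();
    $mech->get('http://example.com/');            # placeholder URL

    # Slow pattern: one node-relative query per row, a wrapper object per
    # cell, and the module doing the ancestor filtering itself.
    for my $tr ( $mech->xpath('//table[1]//tr') ) {
        my @cells = $mech->xpath( './/td', node => $tr );  # note the leading dot
        # ... pull the text out of each wrapper object here
    }

    # Presumably faster: one absolute XPath evaluated entirely by the
    # browser, returning plain strings instead of node wrappers.
    my @cell_text = $mech->xpath(
        '//table[1]//tr/td/text()',
        type => $mech->xpathResult('STRING_TYPE'),
    );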

Of course this is all speculation as long as you can't provide an SSCCE ... :)

Cheers Rolf
(addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
Wikisyntax for the Monastery

Replies are listed 'Best First'.
Re^2: WWW::Mechanize::Chrome VERY slow on xpath obtaining TDs of a TR
by ait (Hermit) on Nov 27, 2022 at 10:27 UTC

    After adding HTML::Tree and parsing some of the data in pure Perl land, I think that IS actually the right approach:

    1. Use W::M::Chrome for JS rendering, JS interactions and high-level xpath
    2. Slurp HTML chunks and process them on the Perl side as much as possible (see the sketch below)
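
    For instance, a minimal sketch of step 2, assuming $mech already holds the rendered page; the tag filters are placeholders that depend on the real markup:

        use strict;
        use warnings;
        use HTML::TreeBuilder;       # from the HTML::Tree distribution

        my $html = $mech->content;   # slurp the rendered DOM as HTML
        my $tree = HTML::TreeBuilder->new_from_content($html);

        for my $tr ( $tree->look_down( _tag => 'tr' ) ) {
            my @cells = map { $_->as_trimmed_text } $tr->look_down( _tag => 'td' );
            print join( "\t", @cells ), "\n" if @cells;
        }
        $tree->delete;               # free the parse tree when done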

      That's one approach.

      But as I said, I think putting the logic into a more elaborate XPath that does the heavy lifting inside the browser would fix your performance issue without needing HTML::Tree.

      IMHO your code will force the Perl part of W:M:C to do a lot of its own filtering and to create thousands of proxy objects. These Perl objects will also tunnel requests back and forth to the browser for most method calls.

      Hence many potential bottlenecks.

      update

      As an illustration, this XPath in Chrome's dev console for https://meta.wikimedia.org/wiki/Wikipedia_article_depth returns 1016 strings at once:

      //table[3]//tr//td//text()

      Disclaimer: I don't have W:M:C installed and my XPath-fu is rusty, so I'm pretty sure there are even better ways to do it.
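
      Fed back through W:M:C it would presumably look something like this (untested here, for the reason given in the disclaimer):

          my @cell_text = $mech->xpath(
              '//table[3]//tr//td//text()',
              type => $mech->xpathResult('STRING_TYPE'),
          );
          print scalar(@cell_text), " strings\n";   # ~1016 on the page above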

      Cheers Rolf
      (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
      Wikisyntax for the Monastery

        True.
