bliako has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed Monks (especially the W::M::C one!)

I am not able to find elements in the DOM that are injected dynamically (e.g. via JavaScript after page load).

These fail: wait_until_visible(xpath=>...) and xpath('...').

<!-- save me as ./content.html -->
<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<div id='id1'>element in html</div>
<div id='dynamic-container'></div>
<script>
document.addEventListener("DOMContentLoaded", function(){
  setTimeout(function(){
    var anElem = document.createElement('span');
    anElem.setAttribute("id", "id2");
    anElem.innerHTML = "appearing after 1.5s";
    document.getElementById('dynamic-container').appendChild(anElem)
  },1500);
  setTimeout(function(){
    var dyn = document.getElementById('id2');
    var str = "The dynamic element was "+(dyn==null?"NOT":"")+" found by getElementById()";
    alert(str);
    console.log(str);
  },2000);
}); // on dom loaded
</script>
</body>
</html>
#!/usr/bin/env perl
# it assumes a ./content.html is present on same dir

use strict;
use warnings;

use WWW::Mechanize::Chrome;
use Log::Log4perl qw(:easy);
use FindBin;

Log::Log4perl->easy_init($ERROR);

my %mech_params = (
    headless => 0,
    launch_arg => [
        '--window-size=600x800',
        '--password-store=basic', # do not ask me for stupid chrome account password
#       '--remote-debugging-port=9223',
#       '--enable-logging', # see also log above
        '--disable-gpu',
        '--no-sandbox',
        '--ignore-certificate-errors',
        '--disable-background-networking',
        '--disable-client-side-phishing-detection',
        '--disable-component-update',
        '--disable-hang-monitor',
        '--disable-save-password-bubble',
        '--disable-default-apps',
        '--disable-infobars',
        '--disable-popup-blocking',
    ],
);

my $mech = WWW::Mechanize::Chrome->new(%mech_params);

$mech->get('file://'.$FindBin::Bin.'/content.html');

my $elem;
$elem = eval { $mech->xpath('//div[@id="id1"]', single=>1) };
die "non-dynamic element not found" unless defined $elem;

$mech->sleep(2);

my $ret = eval {
    $mech->wait_until_visible(
        xpath=>'//div[@id="id2"]',
        timeout=>2, # it's already there after 1.5s
    );
    1;
};
die "dynamic element not found" unless defined($ret) && ($ret==1) && ! $@;

print "OK dynamic element found!\n";

What I am trying to do is to get notified when a page has finally loaded and settled, i.e. when all sorts of dynamic HTML elements have been loaded, long after the "DOM-ready" event was fired.

JavaScript's getElementById() succeeds. Do I have to do the polling myself with JavaScript via eval()?

Edit:

I have solved this by following ++LanX's suggestion, which is to search/poll for HTML elements via JavaScript. So I am now using something like: do { ($ret, $typ) = $mech->eval($js) } while($ret==0 && !$timeout && $mech->sleep(0.5));, where $js could be something like the following, which returns the number of nodes matched by the specified XPath selector:

document.evaluate('//div[@id="abc"]', document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength;
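
For anyone who wants to reuse this, here is a minimal sketch of how the polling loop could be wrapped up, with a timeout. Both xpath_count_js() and poll_until() are hypothetical helper names of my own and not part of WWW::Mechanize::Chrome. The only calls assumed on the $mech side are the eval() and sleep() already used above, and they appear only in the trailing comment, so the sketch runs without a browser by using a stand-in check.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP ();
use Time::HiRes ();

# Hypothetical helper: build a JavaScript expression that evaluates to the
# number of nodes matching $xpath. JSON encoding quotes the XPath safely.
sub xpath_count_js {
    my ($xpath) = @_;
    my $quoted = JSON::PP->new->allow_nonref->encode($xpath);
    return "document.evaluate($quoted, document, null, "
         . "XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength";
}

# Hypothetical helper: call $check until it returns a true value or
# $opt{timeout} seconds have passed. Returns the last value from $check,
# so a false return value means the timeout was hit.
sub poll_until {
    my ($check, %opt) = @_;
    my $interval = $opt{interval} // 0.5;
    my $sleep    = $opt{sleep}    // sub { Time::HiRes::sleep($_[0]) };
    my $deadline = Time::HiRes::time() + ($opt{timeout} // 10);

    while (1) {
        my $found = $check->();
        return $found if $found || Time::HiRes::time() >= $deadline;
        $sleep->($interval);
    }
}

print xpath_count_js('//span[@id="id2"]'), "\n";

# Stand-in check so the sketch runs without a browser: no match on the
# first two attempts, one match on the third.
my $attempts = 0;
my $found = poll_until(
    sub { ++$attempts >= 3 ? 1 : 0 },
    timeout  => 2,
    interval => 0.1,
);
print "found=$found after $attempts attempts\n";

# With a live browser the check would wrap the eval from the post, e.g.:
#   my $js = xpath_count_js('//span[@id="id2"]');
#   poll_until(sub { my ($n) = $mech->eval($js); $n },
#              sleep => sub { $mech->sleep($_[0]) });
```

Passing the sleep as a callback is deliberate: in a real script it is probably better to hand in $mech->sleep than a plain sleep, so that the browser connection keeps being serviced while waiting.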

Thank you Corion for WWW::Mechanize::Chrome

thank you

bw, bliako

Replies are listed 'Best First'.
Re: How to find dynamic DOM elements with WWW::Mechanize::Chrome?
by LanX (Saint) on Apr 11, 2025 at 13:40 UTC
    A long time ago I had similar problems with WWW::Mechanize::Firefox and I remember solving it by injecting my own JS to collect the data.

    Even if you can make "wait_until_visible" work in this case, you'll always find cases where you can't avoid injecting your own JS.

    My 2¢, hope they help :)

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      Fine, I will do that until a high-level solution arrives. Thanks!

        Maybe wait till the module's author has had a chance to react ... ;)

        Cheers Rolf