in reply to Re^2: WWW::Mechanize::PhantomJS can't click button
in thread WWW::Mechanize::PhantomJS can't click button

as soon as the previous action completes,

Welcome to the world of asynchronous programming.

This is further complicated by the fact that you are doing everything by proxy, and that the AJAX calls to the server often leave the browser on the same page, so you can't even inject JavaScript to check page.onLoadFinished or document.readyState, because the page is already loaded. But there are per-site hacks that can help you achieve what you want. For your specific case, I noticed that when click() happens you are either redirected to a new URL or a textbox fills in with an error message. Here is your answer, then:

use strict;
use warnings;
use WWW::Mechanize::PhantomJS;

my $mech = WWW::Mechanize::PhantomJS->new();
$mech->get('https://profile.ccli.com/account/signin?appContext=OLR&returnUrl=https%3A%2F%2Freporting.ccli.com%2F');
$mech->field( EmailAddress => 'me@test.com' );
$mech->field( Password => 'mypw' );

my $ori_uri = $mech->uri;
print "clicking from '$ori_uri' ... \n";
my $x = $mech->click_button( id => 'sign-in' );

my $maxtime = 50; # do this for a max of 50 seconds
my $success;
while( $maxtime-- > 0 ){
    print "checking if uri has changed from original : '".$mech->uri."' ...\n";
    if( $mech->uri ne $ori_uri ){ $success = 1; last }
    # similarly check for the contents of the error textbox otherwise
    sleep(1);
}
die "something went wrong and never left the page..." unless $success;

print "finally left and now in this uri : ".$mech->uri."\n";
$mech->render_content( format => 'png', filename => 'ccli_login.png' );
print "done!\n";
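If you end up polling for more than one condition (URI changed, error textbox filled in), the loop above can be factored into a small helper. This is only a sketch: wait_until and its timeout/interval options are names I made up for illustration, not part of WWW::Mechanize::PhantomJS.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

# Hypothetical helper (not part of any module): call $check repeatedly
# until it returns a true value or $timeout seconds have passed.
# Returns the true value from $check, or undef on timeout.
sub wait_until {
    my ($check, %opt) = @_;
    my $timeout  = $opt{timeout}  // 50;   # seconds, same default as the loop above
    my $interval = $opt{interval} // 1;    # seconds between checks
    my $deadline = time() + $timeout;
    while (1) {
        my $result = $check->();
        return $result if $result;
        last if time() >= $deadline;
        sleep($interval);
    }
    return;
}

# Usage with $mech and $ori_uri from the script above:
#   my $left = wait_until( sub { $mech->uri ne $ori_uri }, timeout => 50 );
#   die "something went wrong and never left the page..." unless $left;
```

The same helper can wrap a check on the error textbox, so a failed login returns early instead of waiting out the full timeout.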

Personally, I rarely, if ever, use Mechanize for scraping. Instead, I first inspect the HTTP requests the site makes and then emulate them using LWP::UserAgent. Even the most convoluted JavaScript-driven, AJAX-calling click() will eventually resort to some POST or GET, which you can capture by opening the web developer tools in Firefox and watching the Network tab. (I have even managed to do that with a "Microsoft Power BI" site, which is the epitome of twisted perversion, conceived by a mind descended from a union between the von Masoch and Marquis de Sade families.) But there is now an increasing number of sites that offer a (REST) API to their services, which you can again harvest with LWP::UserAgent or something similar, e.g. Mojo::UserAgent.
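To illustrate the LWP::UserAgent approach, here is a minimal sketch that builds such a POST without sending it. The field names are simply the ones used with $mech->field in this thread, and the URL is the sign-in page; whether this site accepts a plain form POST, and whether it also wants hidden tokens or extra headers, is an assumption you would have to verify in the Network tab. build_signin_request is a hypothetical name.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# ASSUMPTION: the form posts these two fields to this URL. Check the
# real endpoint, field names and any hidden tokens in the Network tab.
my $signin_url = 'https://profile.ccli.com/Account/SignIn';

# Hypothetical helper: build (but do not send) the form POST.
sub build_signin_request {
    my ($email, $password) = @_;
    return POST $signin_url, [
        EmailAddress => $email,
        Password     => $password,
    ];
}

my $ua = LWP::UserAgent->new(
    cookie_jar => {},   # in-memory cookie jar, kept across requests
);

my $req = build_signin_request('me@test.com', 'mypw');
print $req->method, ' ', $req->uri, "\n";
print $req->content, "\n";

# Uncomment to actually send it and inspect the response:
# my $res = $ua->request($req);
# print $res->status_line, "\n";
```

Printing the request first lets you compare it line by line with what the browser sends before you hit the server.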

And that brings us to your "screenshot". If you insist on getting a screenshot (i.e. rendering the HTML received), that will be very difficult with this technique, because it gives you the page's content (as HTML+JS or JSON) but neither renders it nor runs any JS. So you will probably be able to get a "last-login-time" field out of the data, or a picture of your avatar, but you will most likely not be able to render the HTML you received unless it is some straightforward case. The HTML will, however, contain all you need, and you can parse it with a DOM parser, e.g. HTML::TreeBuilder or Mojo::DOM.
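As a rough sketch of the parsing step with Mojo::DOM: the HTML below is a made-up stand-in for a response body, and the element ids, class names and values in it are hypothetical, not taken from the actual site.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical HTML standing in for a response body fetched with LWP.
my $html = <<'HTML';
<html><body>
  <div id="profile">
    <span id="last-login-time">EXAMPLE-TIMESTAMP</span>
    <img class="avatar" src="/img/example-avatar.png">
  </div>
</body></html>
HTML

my $dom = Mojo::DOM->new($html);

# CSS selectors pick out the pieces of interest; at() returns the
# first matching element (or undef if nothing matches).
my $last_login = $dom->at('#last-login-time')->text;
my $avatar_src = $dom->at('img.avatar')->attr('src');

print "last login: $last_login\n";
print "avatar: $avatar_src\n";
```

In real use, check that at() returned something before calling a method on it, since a changed page layout will otherwise die with a method call on undef.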

bw, bliako

Re^4: WWW::Mechanize::PhantomJS can't click button
by tel2 (Pilgrim) on Apr 29, 2021 at 23:28 UTC
    Thanks again for all that, bliako.

    I don't really need to do a screen shot.  That was just an attempt at an easy way to see what had happened.

    I tried your code, and its timing seems to work, thank you!  So that's the timing sorted.

    However, I still have problems:
    - If I try to log in with a dummy email/password (as we've used above), I should get the error message "Email or password not found." etc., but I don't.  The screenshot shows "Please try again later." etc., with the email address selected (i.e. blue background).
    - If I try to log in with a valid email/password, I get the same result as with the dummy email/password.  It should log me in and take me to "https://profile.ccli.com/".
    - This "Please try again later." message is the same one I get when I manually try to log in, with a valid or invalid email/password, while JavaScript is disabled in Firefox.

    Any ideas what's causing the above problems and how I can resolve them?

    Re using Mechanize for scraping: I used to use WWW::Mechanize (not WWW::Mechanize::PhantomJS) for logging in to this same site, and I used Firefox's dev tools to grab the POST info to do that.  But last year they changed the site and I couldn't see how to get that working anymore (perhaps just my lack of understanding), and they seemed to be requiring JavaScript, so I installed phantomjs and WWW::Mechanize::PhantomJS.  If I manually go to the site in Firefox with JavaScript disabled, there's a message at the top: "JavaScript is disabled. Please enable to use the site.".  Can I ignore that message when it comes to my code, or do you think they might be blocking non-JavaScript agents?

    (BTW, here's a shorter URL that I've just realised we can use for this: https://profile.ccli.com/Account/SignIn).

      You will have better results when you set a user-agent string and/or a file for persisting cookies like so:

      my $mech = WWW::Mechanize::PhantomJS->new(
          'cookie_file' => 'cookies.txt'
      );
      $mech->add_header(
          # use a proper string from somewhere or your own browser
          'User-Agent' => 'Mozilla/5.0 ... bla bla bla'
      );

      I said "and/or cookie file" because I could not find the cookie file in the current directory. I don't know where it is stored, if it is stored at all.

      Regarding POSTing with LWP, it seems viable.

      bw, bliako

        Thanks again, bliako.  I might try that sometime.  I seem to recall trying cookies last year when I was trying to get this working, but I should make sure and also try changing the user-agent as you've suggested.