Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello,
As a Perl beginner, I am struggling to hit a button using LWP and store the next page.

The button looks like this:

<div style="padding:18px;"> <a href="https://something><button class="button name">continue...</button></a> </div>

perl:
#---------
use HTTP::Request::Common qw(GET POST);
use LWP::UserAgent ();
$ua = LWP::UserAgent->new;

$request = GET $url;
$response = $ua->simple_request($request);
$page = $response->content;
#----------

Then $page contains the button to continue to href.
How do I get LWP to hit the button and store the next page?

Replies are listed 'Best First'.
Re: LWP hit button to continue
by haj (Vicar) on Mar 23, 2021 at 15:30 UTC

    LWP doesn't hit buttons - it just fetches the page and leaves it up to you to parse and process it.
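
    As a rough illustration of "parse and process it yourself", here is a minimal sketch that pulls the href out of an a element wrapping a button, using only core Perl. The helper name find_button_link and the example.com URL are made up for illustration, and the sample string adds the closing quote that the posted HTML lacks. A regex like this is fragile on real-world HTML, so treat it as a sketch rather than a robust parser.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): return the href of the first
# <a> element whose content starts with a <button>, or undef if none.
sub find_button_link {
    my ($html) = @_;

    # <a ... href="URL" ...> followed directly by a <button ...> tag
    if ( $html =~ m{<a\b[^>]*\bhref\s*=\s*"([^"]*)"[^>]*>\s*<button\b}i ) {
        return $1;
    }
    return;
}

# Sample markup modelled on the question; the URL is an assumption.
my $page = '<div style="padding:18px;"> '
         . '<a href="https://example.com/next">'
         . '<button class="button name">continue...</button></a> </div>';

my $link = find_button_link($page);
print defined $link ? "$link\n" : "no link found\n";
```

    The URL returned could then be fetched with the same user agent, e.g. $ua->get($link), to store the next page.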

    But, of course, others have done the heavy lifting for you: Check out WWW::Mechanize, which is built on top of LWP, and has a click method to simulate clicking the button.

      Actually it is a button to agree to be over 18, so knowing the next link is no use. Some sort of permission from the site seems to be required.
        I can't help you there: I don't even know whether you are over 18 :)
      Yes, I have been there, but being a beginner I did not understand how to get that working.
      Would it be possible to post code that works for the given example?

        It's actually not that hard. I missed in your example HTML that the button is just "decoration" for the a element (which is missing a closing quote after https://something), so you don't need to "click" a button, you need to follow a link.

        Note that you need to supply $url yourself.

        use WWW::Mechanize;
        $ua = WWW::Mechanize->new;
        my $response = $ua->get($url);
        my $next_response = $ua->follow_link(text_regex => qr/continue.../);


Re: LWP hit button to continue
by hippo (Archbishop) on Mar 23, 2021 at 15:26 UTC
      The link was parsed as seen in the given html: "https://something".
      Problem is hitting the button: to agree to continue.

        Quick and dirty example using WWW::Mechanize to get a fake page, click a button, print the title of the next page:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use WWW::Mechanize;

        my $mech = WWW::Mechanize->new();
        $mech->get('https://derpderpderp.com');    # fake URL
        $mech->click_button( number => 2 );        # click the second button on the fake page

        # do whatever you want with the next page, e.g. print the title
        print $mech->title;
Re: LWP hit button to continue
by bliako (Abbot) on Mar 23, 2021 at 19:44 UTC

    You have some options. But first make sure that your $ua accepts cookies and also has an acceptable user-agent string:

    use LWP::UserAgent ();
    use HTTP::CookieJar::LWP ();
    $ua = LWP::UserAgent->new;
    my $jar = HTTP::CookieJar::LWP->new;
    $ua->cookie_jar($jar);
    $ua->agent('underage bot dont let me in!!!'); # <<< change that

    Bonus: HTTP::Cookies::Mozilla can be helpful if you ever need to load Firefox's cookie jar.

Re: LWP hit button to continue
by perlfan (Parson) on Mar 25, 2021 at 17:26 UTC
    Not what you asked for, but I really like Firefox::Marionette and have been really wanting to try Playwright. The latter necessarily uses the capabilities used in the former to facilitate cross-browser support (Playwright is at its core a Node-based tool, the Perl module interacts with it; and the Playwright module doesn't use the FF module). It's more of a Selenium replacement, but provided you can spare the cycles to spin up a sandboxed headless Firefox or Chrome, then this might open another option to you.
Re: LWP hit button to continue
by Anonymous Monk on Mar 23, 2021 at 19:23 UTC
    The problem was solved without clicking the button!

    In Perl I had to open https://example.com/site/content.htm
    Then I got a page with a button whose href pointed to that same URL.
    I had already tried to open that page again without the button, but I got the page with the button back again. I was caught in a loop.

    The trick I applied was getting an upward link: https://example.com/site/

    This gave me the same page with the button and the same href.
    But now opening that href worked!
    No need for mechanize to click the button anymore.
    But I will try the solution given to that as well.