Re: LWP hit button to continue
by haj (Vicar) on Mar 23, 2021 at 15:30 UTC
|
LWP doesn't hit buttons - it just fetches the page and leaves it up to you to parse and process it.
But, of course, others have done the heavy lifting for you: Check out WWW::Mechanize, which is built on top of LWP, and has a click method to simulate clicking the button.
| [reply] |
|
|
Actually, it is a button to confirm that you are over 18, so knowing the next link is no use. Some sort of permission from the site seems to be required.
| [reply] |
|
|
I can't help you there: I don't even know whether you are over 18 :)
| [reply] |
|
|
Yes, I have been there, but being a beginner I did not understand how to get that working. Would it be possible to post code that works on the given example?
| [reply] |
|
|
It's actually not that hard. I missed in your example HTML that the button is just "decoration" for the a element (which is missing a closing quote after https://something), so you don't need to "click" a button, you need to follow a link.
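Note that you have to supply $url yourself; here it is taken from the command line.
use strict;
use warnings;
use WWW::Mechanize;
my $url = shift @ARGV; # the page that shows the "continue" link
my $ua = WWW::Mechanize->new;
my $response = $ua->get($url);
my $next_response = $ua->follow_link(text_regex => qr/continue.../);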
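If the regex does not match any link, it can help to look the link up first and follow it only when it exists. The following is a minimal sketch, not tested against the site in question: the helper name follow_first_match and the pattern qr/continue/i are made up for illustration, while find_link and url_abs are regular WWW::Mechanize methods. Nothing is fetched unless a URL is passed on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch $url, look for the first link whose text
# matches $text_re, and fetch that link's target.
# Returns the new response, or undef if no link matched.
sub follow_first_match {
    my ($mech, $url, $text_re) = @_;
    $mech->get($url);
    my $link = $mech->find_link(text_regex => $text_re)
        or return;
    return $mech->get($link->url_abs);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new;
    my $response = follow_first_match($mech, $ARGV[0], qr/continue/i);
    print defined $response
        ? "Now at " . $mech->uri . "\n"
        : "No matching link found\n";
}
```

Separating find_link from the second get makes it easy to print or inspect the link before following it.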
| [reply] [d/l] |
Re: LWP hit button to continue
by hippo (Archbishop) on Mar 23, 2021 at 15:26 UTC
|
| [reply] |
|
|
The link was parsed as seen in the given HTML: "https://something".
The problem is hitting the button to agree to continue.
| [reply] |
|
|
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new();
$mech->get('https://derpderpderp.com'); # fake URL
$mech->click_button( number => 2 ); # click the second button in the current form of the fake page
# do whatever you want with the next page, e.g. print the title
print $mech->title;
| [reply] [d/l] |
Re: LWP hit button to continue
by bliako (Abbot) on Mar 23, 2021 at 19:44 UTC
|
You have some options. But first make sure that your $ua accepts cookies and also has an acceptable user-agent string:
use LWP::UserAgent;
use HTTP::CookieJar::LWP ();
my $ua = LWP::UserAgent->new;
my $jar = HTTP::CookieJar::LWP->new;
$ua->cookie_jar($jar);
$ua->agent('underage bot dont let me in!!!'); # <<< change that
Bonus: HTTP::Cookies::Mozilla can be helpful if you ever need to load Firefox's cookie jar.
| [reply] [d/l] [select] |
Re: LWP hit button to continue
by perlfan (Parson) on Mar 25, 2021 at 17:26 UTC
|
Not what you asked for, but I really like Firefox::Marionette and have been wanting to try Playwright. The latter necessarily relies on the same browser capabilities as the former in order to support multiple browsers. (Playwright is at its core a Node-based tool that the Perl module talks to; the Playwright module does not use the Firefox module.) It's more of a Selenium replacement, but if you can spare the cycles to spin up a sandboxed headless Firefox or Chrome, it might give you another option.
Edited to fix a former/latter mix-up.
| [reply] |
Re: LWP hit button to continue
by Anonymous Monk on Mar 23, 2021 at 19:23 UTC
|
The problem was solved without clicking the button!
In Perl I had to open https://example.com/site/content.htm
That returned a page with a button whose href pointed to that same URL.
I had already tried to open that page again without the button, but I got the page with the button back again. Caught in a loop.
The trick I applied was to fetch the parent URL first: https://example.com/site/
This gave me the same page with the button and the same href.
But now opening that href worked!
No need for WWW::Mechanize to click the button anymore.
But I will try the solution given to that as well.
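For anyone who wants to reproduce this, here is a rough sketch of the two-step fetch. It assumes, without confirmation from the site, that the first request sets a cookie which the second request must send back, so the user agent is given an in-memory cookie jar. The helper name fetch_via_parent is made up for illustration; the two URLs are passed on the command line (parent first, then the content page), and nothing is fetched otherwise.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: fetch $parent_url first, then $content_url, with the
# same user agent, so that any cookies set by the first response are sent
# along with the second request. Returns the second response object.
sub fetch_via_parent {
    my ($ua, $parent_url, $content_url) = @_;

    my $first = $ua->get($parent_url);
    die "Cannot fetch $parent_url: " . $first->status_line . "\n"
        unless $first->is_success;

    return $ua->get($content_url);
}

# Run only when both URLs are given on the command line.
if (@ARGV == 2) {
    require LWP::UserAgent;
    # cookie_jar => {} enables an in-memory cookie jar (assumption: the
    # site relies on cookies; the thread does not confirm this).
    my $ua = LWP::UserAgent->new(cookie_jar => {});
    my $response = fetch_via_parent($ua, @ARGV);
    print $response->is_success
        ? $response->decoded_content
        : "Failed: " . $response->status_line . "\n";
}
```

If the second request still returns the page with the button, cookies are probably not what the site checks, and the WWW::Mechanize approach from the other replies is the next thing to try.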
| [reply] |