in reply to LWP::Useragent doesn't work on certain HTTPS websites?

This is a common problem for sites which use the bot manager offered by Akamai CDN to protect against being crawled by bots:
$ dig www.target.com.au
...
www.target.com.au.      6886    IN      CNAME   shop.target.com.au.edgekey.net.
shop.target.com.au.edgekey.net. 6886 IN CNAME   e1380.x.akamaiedge.net.
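If you want to check from Perl whether a host is fronted by Akamai before digging further, the same CNAME chain can be queried with the Net::DNS module. This is a minimal sketch (my addition, not part of the original diagnosis); the edgekey.net and akamaiedge.net suffixes it matches are simply the ones visible in the dig output above:

use warnings;
use strict;
use Net::DNS;

my $host     = 'www.target.com.au';
my $resolver = Net::DNS::Resolver->new;
my $reply    = $resolver->query($host, 'CNAME');

if ($reply) {
    for my $rr ($reply->answer) {
        next unless $rr->type eq 'CNAME';
        # e.g. shop.target.com.au.edgekey.net or e1380.x.akamaiedge.net
        print $rr->cname, " looks like an Akamai edge host\n"
            if $rr->cname =~ /\.(?:edgekey|akamaiedge)\.net\.?$/;
    }
}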
The bot manager detects bots based on specific traits. Currently the requirements seem to be that Accept-Encoding and Accept-Language are set, that the User-Agent looks like a browser (for example Mozilla/5.0), and that Connection is Keep-Alive. If these conditions are not met, the client is treated as a bot, which might result in hanging requests or error messages. The Connection header is automatically set by LWP to the expected value, but the others need to be set explicitly:
use warnings;
use strict;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new();

# Set the headers the bot manager checks for. The Connection header is
# already handled by LWP, so only these three need to be set explicitly.
my $res = $ua->get('https://www.target.com.au/',
    'Accept-Language' => 'en-US',
    'User-Agent'      => 'Mozilla/5.0',
    'Accept-Encoding' => 'identity',
);
print $res->content;
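If the client makes more than one request, it is less repetitive to attach these headers to the UserAgent object once instead of passing them to every get call. Here is a sketch of the same fix using the agent and default_header methods of LWP::UserAgent (this variant is my addition, not part of the original code above):

use warnings;
use strict;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new();

# Attach the headers to the UserAgent itself so that every request
# sends them, not just a single get call.
$ua->agent('Mozilla/5.0');
$ua->default_header('Accept-Language' => 'en-US');
$ua->default_header('Accept-Encoding' => 'identity');

my $res = $ua->get('https://www.target.com.au/');
print $res->content;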
See also Golang Http Get Request very slow, Strange CURL issue with a particular website SSL certificate, Scraping attempts getting 403 error, or Requests SSL connection timeout over at stackoverflow.com for similar problems.

Re^2: LWP::Useragent doesn't work on certain HTTPS websites?
by sectokia (Friar) on Feb 15, 2019 at 11:51 UTC
    Thanks... I figured it had to be headers since it worked in a web browser. I actually tried to copy the headers that were sent by Chrome when the page was fetched, but I missed Accept-Language...