Re: How to get html code from a secure (https:\\) page?

When I do it like this...

#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Curl::Easy;
use Data::Dump;

my $fetch = sub {
    my $curl = WWW::Curl::Easy->new();
    my ( $header, $body );
    $curl->setopt( CURLOPT_URL,            shift );
    $curl->setopt( CURLOPT_WRITEHEADER,    \$header );
    $curl->setopt( CURLOPT_WRITEDATA,      \$body );
    $curl->setopt( CURLOPT_FOLLOWLOCATION, 1 );
    $curl->setopt( CURLOPT_TIMEOUT,        10 );
    $curl->setopt( CURLOPT_SSL_VERIFYPEER, 1 );
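    # perform returns 0 on success; any failure text ends up in errbuf below
    $curl->perform;

    # Explicit return, so the braces are unambiguously an anonymous hash
    return {
        header => $header,
        body   => $body,
        info   => $curl->getinfo(CURLINFO_HTTP_CODE),
        error  => $curl->errbuf,
    };
};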

my $result = $fetch->(shift);

dd $result;

__END__

...I get:

karls-mac-mini:playground karl$ ./curl.pl https://sharecenter.com/Pages/Software.aspx
{
  body   => undef,
  error  => "SSL peer certificate or SSH remote key was not OK",
  header => undef,
  info   => 0,
}

...but with $curl->setopt( CURLOPT_SSL_VERIFYPEER, 0 ); I get:

karls-mac-mini:playground karl$ ./curl.pl https://sharecenter.com/Pages/Software.aspx
{
  body   => "<!-- b2 -->",
  error  => "",
  header => "HTTP/1.0 200 OK\r\nDate: Mon, 22 Jan 2018 09:47:51 GMT\r\nServer: Apache/2.2.22\r\nExpires: Mon, 26 Jul 1997 05:00:00 GMT\r\nLast-Modified: Mon, 22 Jan 2018 09:47:51 GMT\r\nCache-Control: no-store, no-cache, must-revalidate\r\nCache-Control: post-check=0, pre-check=0\r\nPragma: no-cache\r\nSet-Cookie: tu=dc6816b4e45149c7421e46e39052dfef; expires=Tue, 31-Dec-2019 23:00:00 GMT; Max-Age=61218729; path=/; domain=sharecenter.com; httponly\r\nX-Adblock-Key: MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBANnylWw2vLY4hUn9w06zQKbhKBfvjFUCsdFlb6TdQhxb9RXWXuI4t31c+o8fYOv/s8q1LGPga3DE1L/tHU4LENMCAwEAAQ==_heva/qNbVoSrOKfx6K0UI/DeonTq8ke19pivgTgrL2w9ZtF3/lPIuu2AIlia5FA69jmNJzQb9Afod5WU1oglQw==\r\nVary: Accept-Encoding\r\nContent-Length: 11\r\nContent-Type: text/html; charset=UTF-8\r\nX-Cache: MISS from 110132\r\nConnection: close\r\n\r\n",
  info   => 200,
}
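For a link checker, the two dumps above could be reduced to a coarse status. This is only a minimal sketch: `classify` is a hypothetical helper (not part of WWW::Curl), and it assumes a result hash with the same `info` and `error` keys as the ones printed above.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not part of WWW::Curl): map a result
# hash with the keys dumped above to a coarse status string.
sub classify {
    my ($result) = @_;
    my $code  = $result->{info}  // 0;
    my $error = $result->{error} // '';

    # An HTTP code of 0 means no response was received at all.
    return "transfer failed: $error" if $code == 0 && length $error;
    return 'no response'             if $code == 0;
    return 'ok'                      if $code >= 200 && $code < 300;
    return "http error $code";
}

# The two results shown above, reduced to the fields that matter here.
my $verified   = { info => 0,   error => 'SSL peer certificate or SSH remote key was not OK' };
my $unverified = { info => 200, error => '' };

print classify($verified),   "\n";
print classify($unverified), "\n";
```

Note that an "ok" obtained with CURLOPT_SSL_VERIFYPEER set to 0 only says that some server answered with a 2xx status, not that the answer came from the host you meant to reach.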

I don't know if this is helpful or what you expected.

Best regards, Karl

«The Crux of the Biscuit is the Apostrophe»

perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

Re^2: How to get html code from a secure (https:\\) page? by noxxi (Pilgrim) on Jan 22, 2018 at 12:16 UTC
You also get an error if you visit the site with a browser: NET::ERR_CERT_COMMON_NAME_INVALID. This is because the certificate is issued for cc.sedoparking.com, which is a domain parking service. Looking at the http instead of the https URL, you'll see that the site is actually for sale. This suggests that the resource you want to access is no longer available under this URL.
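To illustrate why the name check fails, here is a deliberately simplified sketch of matching a hostname against a name from a certificate. `name_matches` is a hypothetical helper, and it handles only an exact match or a single leading `*.` wildcard; real TLS clients apply more rules than this and compare against every name listed in the certificate.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): a simplified name
# check, not the full set of rules a TLS library applies.
sub name_matches {
    my ( $host, $cert_name ) = @_;
    ( $host, $cert_name ) = ( lc $host, lc $cert_name );

    # Exact, case-insensitive match.
    return 1 if $host eq $cert_name;

    # A leading "*." covers exactly one label: www.example.com matches
    # *.example.com, but a.b.example.com and example.com do not.
    if ( $cert_name =~ /\A\*\.(.+)\z/ ) {
        my $suffix = $1;
        return $host =~ /\A[^.]+\.\Q$suffix\E\z/ ? 1 : 0;
    }
    return 0;
}

# The requested host versus the name the certificate was issued for.
print name_matches( 'sharecenter.com', 'cc.sedoparking.com' )
    ? "match\n"
    : "mismatch\n";    # prints "mismatch"
```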
Re^3: How to get html code from a secure (https:\\) page? by karlgoethebier (Abbot) on Jan 22, 2018 at 17:50 UTC
Mmh, as the OP wrote: "...check the broken links..."? If this is what he meant and I guessed...