Re^3: getting content of an https website

Replies are listed 'Best First'.

Re^4: getting content of an https website
by Aldebaran (Curate) on Sep 01, 2015 at 07:54 UTC

Thanks AM, I got pretty far with this:

  use strict;
  use warnings;
  use feature 'say';
  use HTML::Display;
  use LWP::UserAgent;

  my $url = 'https://berniesanders.com/issues/racial-justice/';
  my $ua  = LWP::UserAgent->new();
  $ua->agent( 'Windows Mozilla');
  my $response = $ua->get($url);
  my $content  = $response->content;
  $ENV{'PERL_HTML_DISPLAY_COMMAND'}='run "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" %s';
  my $browser=HTML::Display->new();
  if (defined($browser)) {
    $browser->display(html=>$content);
  }
  else {
    print("Unable to open browser: $@\n");
  }

Almost everything gets displayed except the big banner at the top and some stylized words at the bottom. The links with absolute URLs work, but the browser's forward and back arrows behave clumsily when I navigate back to the original page. And what is the original? In the address bar it looks like this:

file:///C:/cygwin64/tmp/9EQdRdu_5w.html

I have trouble deciding how "real" this is at all. Tomorrow, I'll try a different site and see what happens. Thank you.
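One likely reason for the missing banner and styling, though not confirmed for this particular site, is that the page is being opened from the temporary `file:///` location shown above, so any relative URL in the HTML (images, stylesheets) resolves against the local disk rather than the original server. A minimal sketch of one workaround is to insert a `<base>` element into the fetched HTML before displaying it. The helper name `add_base_href` and the sample markup are hypothetical, and the regex-based insertion is a simplification, not a full HTML parser:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Display or LWP): insert a <base>
# element so relative links resolve against the original site instead of
# the temporary file:/// location.
sub add_base_href {
    my ($html, $base) = @_;

    # Leave the document alone if it already declares a base URL.
    return $html if $html =~ /<base\b[^>]*\bhref\s*=/i;

    # Escape characters that would break the attribute value.
    (my $href = $base) =~ s/&/&amp;/g;
    $href =~ s/"/&quot;/g;
    my $tag = qq{<base href="$href">};

    # Put the tag right after the opening <head ...> tag;
    # if the document has no <head>, prepend the tag instead.
    $html =~ s/(<head\b[^>]*>)/$1$tag/i
        or $html = $tag . $html;

    return $html;
}

# Made-up sample markup with a root-relative image URL.
my $sample = '<html><head><title>t</title></head>'
           . '<body><img src="/banner.png"></body></html>';
print add_base_href($sample, 'https://example.com/issues/'), "\n";
```

In the script above, this could be applied as `add_base_href($response->decoded_content, $response->base)` before the result is passed to `display`. Content that the site builds with JavaScript or serves only to particular clients may still be missing.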
