PerlMonks  

Re: BBC4 Radio Schedules and LWP:UserAgent problem

by hippo (Bishop)
on Apr 07, 2018 at 13:51 UTC ( [id://1212483] )


in reply to BBC4 Radio Schedules and LWP:UserAgent problem

You do not need to specify ssl_opts if your URL is not https. That way your code will run without many of the SSL-specific dependencies.
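For example, a plain-HTTP fetch needs nothing beyond the constructor defaults — a minimal sketch (the timeout value is an arbitrary illustration, not anything from the original post):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use LWP::UserAgent;

# No ssl_opts here: none are needed for http:// URLs, and no SSL
# modules get loaded until an https:// request is actually made.
my $ua = LWP::UserAgent->new( timeout => 10 );
print ref($ua), "\n";   # LWP::UserAgent
```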

For how to install modules, start with A Guide to Installing Modules. If you want to see the full tree of dependencies, take a look at http://deps.cpantesters.org/

Replies are listed 'Best First'.
Re^2: BBC4 Radio Schedules and LWP:UserAgent problem
by mr_ron (Chaplain) on Apr 07, 2018 at 15:02 UTC

    The http://www.bbc... URL redirects (301, Moved Permanently) to an https://www.bbc... URL, so he likely does need SSL.

    Ron
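The status code involved can be illustrated without touching the network by hand-building a response — a sketch, not the BBC server's actual reply (the Location value is a placeholder):

```perl
use strict;
use warnings;
use HTTP::Response;

# A hand-built 301 like the one the http:// URL answers with.
# (The Location value is a placeholder, not the server's real header.)
my $res = HTTP::Response->new( 301, 'Moved Permanently' );
$res->header( Location => 'https://www.bbc.co.uk/...' );

print $res->code, "\n";                               # 301
print $res->is_redirect ? "redirect\n" : "final\n";   # redirect
```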
Re^2: BBC4 Radio Schedules and LWP:UserAgent problem
by Anonymous Monk on Apr 07, 2018 at 15:17 UTC
    Thank you for the links about installing modules and getting dependencies. From your comments, I tried modifying the line that 'called' UserAgent to
    my $ua = LWP::UserAgent->new();
    However, I again got the same error message. I suspect I did not do all that is required and I regret to say I do not understand the implication of your comment "That way your code will run without many of the SSL-specific dependencies."
    I would appreciate some more clues about what I should do.
    The first lines of the 'page-source' of the URL in the Perl test code are next - it seems to be HTML
    <!DOCTYPE html>
    <html class="b-header--black--white b-footer--black--white " lang="en-GB">
    <head>
    I am more than happy to try and load the required Perl modules but I am struggling to know where to start.
      I do not understand the implication of your comment "That way your code will run without many of the SSL-specific dependencies."

      So, LWP::UserAgent is there to give you a client for accessing web resources which may be reached either via HTTP or HTTPS. The latter requires SSL (or TLS) and those require lots of extra code in the form of crypto libraries and so forth. That's what your error message is talking about. If the only web resources you are trying to access are over HTTP then you don't need those extra libraries, modules and so on. Note that I'm grossly simplifying here to keep it understandable.

      However, while the sample code you provided lists only an HTTP URL, mr_ron has pointed out that this merely redirects to an HTTPS URL and therefore you do in fact need all the extra code in order to get to the end resource which requires HTTPS. Still with us?
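One way to see that redirect for yourself is to tell the agent not to follow redirects — a sketch only, with the live network call left commented out:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# By default LWP::UserAgent follows up to 7 redirects, which is how a
# GET of the http:// URL silently becomes an https:// request.  With
# max_redirect => 0 the client stops at the 301 instead:
my $ua = LWP::UserAgent->new( max_redirect => 0 );
print $ua->max_redirect, "\n";   # 0

# Uncomment to try it live - the response code should be the bare 301:
# my $res = $ua->get('http://www.bbc.co.uk/radio4/programmes/schedules/fm/2015/10/13');
# print $res->code, "\n";
```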

      Now, here's some sample code using the real, end-point URL explicitly:

      #!/usr/bin/env perl
      use strict;
      use warnings;
      use LWP::UserAgent;

      my $url  = 'https://www.bbc.co.uk/radio4/programmes/schedules/fm/2015/10/13';
      my $ua   = LWP::UserAgent->new();
      my $res  = $ua->get( $url );
      my $html = $res->content;
      print substr( $html, 0, 256 ) . "...\n";

      which produces this output:

      $ perl getr4.pl
      <!DOCTYPE html>
      <html class="b-header--black--white b-footer--black--white " lang="en-GB">
      <head>
          <meta charset="UTF-8">
          <title>BBC Radio 4 FM - Schedules, Tuesday 13 October 2015</title>
          <link rel="icon" href="https://www.bbc.c...
      $

      This is using perl 5.20.3 with LWP::UserAgent 6.15, Mozilla::CA 20141217 and LWP::Protocol::https 6.06. There are alternatives, but you can start with these. Try installing suitably recent versions of these modules using the documentation you have already read. You may need to install other dependencies too. Good luck.

        Thank you for that.
        I have been looking more at the example which failed because the LWP::Protocol::https module is required.
        This failure message is only given when the final line, which prints $html, is included in the Perl script.
        $html is set equal to $res->content. Printing the variable $res itself, I get the text HTTP::Response=HASH(0x54d8608)

        As I only want to read the contents of the original web page, can I simply read the 'content' and therefore store the data in a hash? If so, how is this done?

        If this can be done, it may mean I do not have to find out how to install the missing modules and their dependencies.
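For what it's worth, no hash is involved: $res is an HTTP::Response object, printing it just shows the stringified reference, and ->content already returns the body as a plain string. A sketch with a hand-built response (the HTML snippet is made up for illustration):

```perl
use strict;
use warnings;
use HTTP::Response;

# Printing the object itself gives HTTP::Response=HASH(0x...),
# which is just Perl's default stringification of a blessed hashref.
my $res = HTTP::Response->new( 200, 'OK' );
$res->content('<!DOCTYPE html><html><head></head></html>');   # made-up body

# ->content returns the body as an ordinary string - no hash needed.
my $html = $res->content;
print substr( $html, 0, 15 ), "\n";   # <!DOCTYPE html>
```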
