2mths has asked for the wisdom of the Perl Monks concerning the following question:

I would like to write a script that accesses an internet page (and then does something with it). Using examples from the Perl Cookbook and perldoc lwpcook, I can do this for intranet pages.

However, to achieve full functionality I need to get through a proxy server to the internet. I know nothing about the proxy other than that I have to authenticate with it (fill in the box that pops up with my userID and password).

The only other clue I have is that PPM (Perl Package Manager - Yes I'm afraid I'm in M$ land) uses the environment variables HTTP_Proxy, HTTP_Proxy_User and HTTP_Proxy_Pass to get through the proxy.

Can anyone offer a clue or point me in the direction of possible wisdom?

TIA

Replies are listed 'Best First'.
Re: Access the Internet though a Proxy with Perl
by jasonk (Parson) on Feb 13, 2003 at 17:21 UTC

    LWP::UserAgent documentation includes a description of how to configure proxy support.

    $ua->proxy(['http', 'ftp'], 'http://proxy.sn.no:8001/');
    $ua->proxy('gopher', 'http://proxy.sn.no:8001/');
      Thank you for taking the time to respond. I should have made my initial post clearer (but was trying to keep it brief in case I was wasting people's time).
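    For context, here is a minimal sketch of how those calls might sit in a complete script. The proxy address is the one from the snippet above; the 30 second timeout and the idea of taking the URL from the command line are my own assumptions, not something the documentation prescribes.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Proxy address taken from the snippet above; replace it with your own.
my $proxy_url = 'http://proxy.sn.no:8001/';

my $ua = LWP::UserAgent->new;
$ua->timeout(30);    # assumed value, tune to taste
$ua->proxy(['http', 'ftp'], $proxy_url);

# Calling proxy() with only a scheme returns the proxy configured for it.
print "http requests will go via: ", $ua->proxy('http'), "\n";

# Fetch a page only if a URL was given on the command line.
if (my $url = shift @ARGV) {
    my $res = $ua->get($url);
    if ($res->is_success) {
        print $res->decoded_content;
    }
    else {
        warn "Request failed: ", $res->status_line, "\n";
    }
}
```

    Run without arguments it only reports the proxy setting; run with a URL it attempts the fetch through the proxy. Note that this covers a proxy that does not ask for credentials.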

      I have been able to configure proxy support and tested it by referencing a proxy that does not require authentication.

      Unfortunately the 'production' script will not sit as I do in a DMZ but in a 'production' network where the only Internet access is through the aforementioned and troublesome proxy.

      I use "production" in a sense that colourfully described would see me branded as horizontally challenged-ist.
Re: Access the Internet though a Proxy with Perl
by phydeauxarff (Priest) on Feb 13, 2003 at 17:25 UTC
    I believe your question relates to writing code that can get past a proxy rather than trying to use PPM through a proxy.
    That being the case, take a look at the CPAN module Net::HTTPTunnel.
    I believe it will get you started, and it has references for more reading as a bonus.
      Another much appreciated response.
      It took me a little longer to reply to this than the others because before I could follow it up my employer actually wanted me to do some work.

      This was a very interesting thing to look up and has opened my eyes if not solved my problem. I have played with the example as listed in the documentation on CPAN. I think I am successfully authenticating with the proxy because if I do something like...
      if ( Net::HTTPTunnel stuff ) {print "Worked"}
      ...it does work if I put a valid user ID and password in and doesn't if I make something up. However, I can't work out how to use this to make the jump to getting through the proxy to read a webpage.

      So much to learn, so little brainpower and time to do it with.
Re: Access the Internet though a Proxy with Perl
by dws (Chancellor) on Feb 13, 2003 at 18:46 UTC
    To expand on jasonk's answer, the environment variables that work for PPM work because they're actually honored by LWP.pm, which PPM uses for making HTTP requests. You can use LWP directly.

    If you're using ActiveState Perl, you'll find "lwpcook" (the LWP Cookbook) in the online documentation. It discusses using LWP to get through proxy servers, and gives examples.
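    As a rough sketch of the authenticated case, along the lines of the lwpcook example: this assumes the proxy accepts HTTP Basic authentication (a proxy that insists on something else, such as NTLM, would need a different approach). As far as I can tell, env_proxy only looks at the *_proxy variables, so reading HTTP_proxy_user and HTTP_proxy_pass by hand is my own choice here, and the fallback credentials and example.com URL are placeholders.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->env_proxy;    # picks up http_proxy (and similar) from the environment

# Assumption: credentials are read by hand from the PPM-style variables.
# The fallback values are placeholders only.
my $user = defined $ENV{HTTP_proxy_user} ? $ENV{HTTP_proxy_user} : 'myuserid';
my $pass = defined $ENV{HTTP_proxy_pass} ? $ENV{HTTP_proxy_pass} : 'mypassword';

# Use the URL from the command line if given, else a placeholder.
my $fetch = @ARGV > 0;
my $url   = $fetch ? $ARGV[0] : 'http://www.example.com/';

# Build the request and attach a Proxy-Authorization header (Basic scheme).
my $req = HTTP::Request->new(GET => $url);
$req->proxy_authorization_basic($user, $pass);

# Only go out on the network when a URL was actually supplied.
if ($fetch) {
    my $res = $ua->request($req);
    if ($res->is_success) {
        print $res->decoded_content;
    }
    else {
        warn "Request failed: ", $res->status_line, "\n";
    }
}
```

    A 407 status line in the failure message would suggest the proxy rejected the credentials or wants a different authentication scheme.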

      Again - Thank you for taking the time to reply.

      I have referenced, and continue to reference, the info from "perldoc lwpcook", which I believe you suggest looking at. However, I have had little success on the subject of authentication, the documentation being perhaps a little too brief for a novice such as myself. It was the use of the phrase "should be able" that prompted me to suspect more info might be available 'out there'.

      Also, the perldoc entry for LWP, whilst referencing the use of the environment variable HTTP_Proxy, does not mention the _User and _Pass environment variables. Still, until your post I didn't realise LWP had its own separate perldoc info, and having read it, I see it references some more things that I shall now proceed to check out.