coder57 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to use a proxy with WWW::Mechanize. As I understand it, the methods of LWP::UserAgent should also work with WWW::Mechanize. So far, whenever I fetch a page through a proxy with WWW::Mechanize, I either get an "uninitialised value" warning or get only the first page and none of the following pages.

I understand that a proxy is set in LWP::UserAgent like this:

    my $ua = LWP::UserAgent->new;
    $ua->proxy('http' => 'proxy');

I have tried the same with Mechanize:

    my $mech = WWW::Mechanize->new();
    $mech->proxy('http', 'proxy');

The pages are not https, and I do not understand why the same proxy works fine in IE6/Firefox but not with any Perl script. I tried LWP::Debug qw(+ -conns) and received the error message "simple resonse not implemented". I also tried LWP::UserAgent::ProxyAny and its set_proxy_by_name(...) method, for both LWP and Mechanize, and still got the same error message. I am out of ideas on how to get a proxy to work with any Perl script!

Replies are listed 'Best First'.
Re: Proxy with mechanize
by rpanman (Scribe) on Aug 02, 2007 at 13:49 UTC
    Hi coder57

    Can you supply some more of the LWP::Debug output (use the readmore tag if the output is huge)? There should be a line somewhere saying "Proxied to" and maybe even a return code from the page you are trying to fetch.
      This is the code as it stands:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Data::Dumper;
      use LWP::Useragent;
      use LWP::Debug qw(- +conns);
      use WWW::Mechanize;

      my $total_count = 0;
      my @keywords = ('simpsons', 'diehard', 'fantastic+four', );

      foreach my $keywords(@keywords){
          my $url ='http://news.google.co.uk/search?q=~%22'.$keywords.'%22&num=100&hl=en&safe=off&start=0&as_qdr=all&filter=0';
          my $mech = WWW::Mechanize->new();
          $mech->proxy('http','127.0.0.1:8088');
          $mech->get($url);
          print $mech->uri."\n";
          my @links_to_check = grep { $_->url() !~ /google/i} $mech->find_all_links( url_regex => qr/\./i );
          foreach my $links_to_check (@links_to_check) {
              $total_count++;
              print "$links_to_check \n";
          }
      }
      print " $total_count news items found \n";
      As it is, I get no LWP::Debug output, just:

      Use of uninitialized value in concatenation (.) or string at movien1.pl line 28.
      Use of uninitialized value in concatenation (.) or string at movien1.pl line 28.
      Use of uninitialized value in concatenation (.) or string at movien1.pl line 28.

      Of course, without the proxy I receive a lot of results:

      http://news.google.co.uk/search?q=~%22simpsons%22&num=100&hl=en&safe=off&start=0&as_qdr=all&filter=0
      WWW::Mechanize::Link=ARRAY(0x2a01214)
      WWW::Mechanize::Link=ARRAY(0x2a00174)
      WWW::Mechanize::Link=ARRAY(0x2a002cc)
      WWW::Mechanize::Link=ARRAY(0x29fcd0c)
      WWW::Mechanize::Link=ARRAY(0x2a2d560)
      WWW::Mechanize::Link=ARRAY(0x2a2d458)
      WWW::Mechanize::Link=ARRAY(0x2a2d5cc)
      WWW::Mechanize::Link=ARRAY(0x29ffe80)
      WWW::Mechanize::Link=ARRAY(0x29fffdc)
      WWW::Mechanize::Link=ARRAY(0x2a2dbe8)
      WWW::Mechanize::Link=ARRAY(0x29fffc4)
      WWW::Mechanize::Link=ARRAY(0x29f363c)
      WWW::Mechanize::Link=ARRAY(0x2a00168)
      WWW::Mechanize::Link=ARRAY(0x29ff280)
      WWW::Mechanize::Link=ARRAY(0x2a2d578)
      WWW::Mechanize::Link=ARRAY(0x2a2d320)
      WWW::Mechanize::Link=ARRAY(0x2a00398)
      WWW::Mechanize::Link=ARRAY(0x2a005cc)
      WWW::Mechanize::Link=ARRAY(0x2a000cc)
      WWW::Mechanize::Link=ARRAY(0x2a2dc9c)
      WWW::Mechanize::Link=ARRAY(0x2a2dcb4)
      -- More --
        To get the debug to show everything do the following:
        use LWP::Debug qw(+);
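On newer libwww-perl releases, where LWP::Debug is deprecated, a similar trace can be obtained from user-agent handlers. The following is only a minimal sketch, assuming an LWP::UserAgent recent enough to provide add_handler; attach_trace and its $report callback are hypothetical names for illustration, not part of LWP or WWW::Mechanize.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# attach_trace is a hypothetical helper, not part of LWP or WWW::Mechanize.
# $report is an optional code reference that receives one line of text per
# event; by default the lines go to STDERR.
sub attach_trace {
    my ($ua, $report) = @_;
    $report ||= sub { print STDERR @_, "\n" };

    # Called just before each request is handed to the protocol layer.
    $ua->add_handler(request_send => sub {
        my ($request) = @_;
        $report->('request:  ' . $request->method . ' ' . $request->uri);
        return;    # returning nothing lets the request go ahead as usual
    });

    # Called when a response is complete, including error responses.
    $ua->add_handler(response_done => sub {
        my ($response) = @_;
        my $request = $response->request;
        my $uri     = $request ? $request->uri : '(unknown)';
        $report->('response: ' . $response->status_line . ' for ' . $uri);
        return;
    });

    return $ua;
}
```

Because WWW::Mechanize is a subclass of LWP::UserAgent, calling attach_trace($mech) on a Mechanize object should work the same way, and the status line it reports is the return code asked about earlier in the thread.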
        I have tidied up the code a bit... and changed your proxy line (by adding 'http://' to the proxy address). This now works from behind my proxy server and I guess is the solution to your problem.
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper;
        use LWP::UserAgent;
        use LWP::Debug qw(+);
        use WWW::Mechanize;

        my $total_count = 0;
        my @keywords = ('simpsons', 'diehard', 'fantastic+four', );

        foreach my $keywords(@keywords){
            my $url ='http://news.google.co.uk/search?q=~%22'.$keywords.'%22&num=100&hl=en&safe=off&start=0&as_qdr=all&filter=0';
            my $mech = WWW::Mechanize->new();
            $mech->proxy('http','http://127.0.0.1:8088');
            $mech->get($url);
            print $mech->uri."\n";
            my @links_to_check = grep { $_->url() !~ /google/i} $mech->find_all_links( url_regex => qr/\./i );
            foreach my $links_to_check (@links_to_check){
                $total_count++;
                print "$links_to_check \n";
            }
        }
        print " $total_count news items found \n";
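The only functional change in the script above is the 'http://' prefix on the proxy address. If proxy addresses come from a config file or user input, that prefix could be added defensively. This is just a sketch: normalize_proxy is a hypothetical helper, not part of LWP or WWW::Mechanize, and it assumes that an address without a scheme is meant to be an HTTP proxy.

```perl
use strict;
use warnings;

# normalize_proxy is a hypothetical helper for illustration only.
# It leaves an address alone if it already starts with a scheme such as
# "http://", and otherwise assumes an HTTP proxy and prepends "http://".
sub normalize_proxy {
    my ($address) = @_;
    return $address if $address =~ m{^[a-z][a-z0-9+.-]*://}i;
    return "http://$address";
}

print normalize_proxy('127.0.0.1:8088'), "\n";
print normalize_proxy('http://127.0.0.1:8088'), "\n";
```

It would then be used as $mech->proxy('http', normalize_proxy('127.0.0.1:8088')); in place of the hard-coded string.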
        When you print $links_to_check you will get output like WWW::Mechanize::Link=ARRAY(0x2a01214). This is because you are printing a Mechanize link object (see WWW::Mechanize::Link). You can dig down a bit to get the URL etc., and the documentation should help you with that.
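As a rough illustration of digging into a link object, the sketch below uses the url and text accessors from WWW::Mechanize::Link. The describe_link helper is a hypothetical name, and the link is built by hand with a made-up URL and text so that the example runs without a network connection; in the script above the objects would come from find_all_links instead.

```perl
use strict;
use warnings;
use WWW::Mechanize::Link;

# describe_link is a hypothetical helper that turns a link object into a
# readable line instead of the default WWW::Mechanize::Link=ARRAY(...) form.
sub describe_link {
    my ($link) = @_;
    my $text = defined $link->text ? $link->text : '(no text)';
    return $link->url . ' [' . $text . ']';
}

# A hand-built link (made-up URL and text) standing in for one returned
# by $mech->find_all_links.
my $example = WWW::Mechanize::Link->new({
    url  => 'http://example.com/story.html',
    text => 'Example story',
    tag  => 'a',
});

print describe_link($example), "\n";
```

In the loop from the script, print describe_link($links_to_check), "\n"; would replace the line that prints the object itself.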