in reply to Proxy with mechanize

Hi coder57

Can you supply some more of the LWP::Debug output (use the readmore tag if the output is huge)? There should be a line somewhere saying "Proxied to" and maybe even a return code from the page you are trying to fetch.

Re^2: Proxy with mechanize
by coder57 (Novice) on Aug 02, 2007 at 15:20 UTC
    This is the code as it stands
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;
    use LWP::Useragent;
    use LWP::Debug qw(- +conns);
    use WWW::Mechanize;

    my $total_count = 0;
    my @keywords = ('simpsons', 'diehard', 'fantastic+four', );

    foreach my $keywords(@keywords){
        my $url ='http://news.google.co.uk/search?q=~%22'.$keywords.'%22&num=100&hl=en&safe=off&start=0&as_qdr=all&filter=0';
        my $mech = WWW::Mechanize->new();
        $mech->proxy('http','127.0.0.1:8088');
        $mech->get($url);
        print $mech->uri."\n";
        my @links_to_check = grep { $_->url() !~ /google/i} $mech->find_all_links( url_regex => qr/\./i );
        foreach my $links_to_check (@links_to_check) {
            $total_count++;
            print "$links_to_check \n";
        }
    }
    print " $total_count news items found \n";
    As it is I get no LWP::Debug output, just:

    Use of uninitialized value in concatenation (.) or string at movien1.pl line 28.
    Use of uninitialized value in concatenation (.) or string at movien1.pl line 28.
    Use of uninitialized value in concatenation (.) or string at movien1.pl line 28.

    Of course, without the proxy I receive a lot of results:

    http://news.google.co.uk/search?q=~%22simpsons%22&num=100&hl=en&safe=off&start=0&as_qdr=all&filter=0
    WWW::Mechanize::Link=ARRAY(0x2a01214)
    WWW::Mechanize::Link=ARRAY(0x2a00174)
    WWW::Mechanize::Link=ARRAY(0x2a002cc)
    WWW::Mechanize::Link=ARRAY(0x29fcd0c)
    WWW::Mechanize::Link=ARRAY(0x2a2d560)
    WWW::Mechanize::Link=ARRAY(0x2a2d458)
    WWW::Mechanize::Link=ARRAY(0x2a2d5cc)
    WWW::Mechanize::Link=ARRAY(0x29ffe80)
    WWW::Mechanize::Link=ARRAY(0x29fffdc)
    WWW::Mechanize::Link=ARRAY(0x2a2dbe8)
    WWW::Mechanize::Link=ARRAY(0x29fffc4)
    WWW::Mechanize::Link=ARRAY(0x29f363c)
    WWW::Mechanize::Link=ARRAY(0x2a00168)
    WWW::Mechanize::Link=ARRAY(0x29ff280)
    WWW::Mechanize::Link=ARRAY(0x2a2d578)
    WWW::Mechanize::Link=ARRAY(0x2a2d320)
    WWW::Mechanize::Link=ARRAY(0x2a00398)
    WWW::Mechanize::Link=ARRAY(0x2a005cc)
    WWW::Mechanize::Link=ARRAY(0x2a000cc)
    WWW::Mechanize::Link=ARRAY(0x2a2dc9c)
    WWW::Mechanize::Link=ARRAY(0x2a2dcb4)
    -- More --
      To get the debug to show everything do the following:
      use LWP::Debug qw(+);
      I have tidied up the code a bit... and changed your proxy line (by adding 'http://' to the proxy address). This now works from behind my proxy server and I guess is the solution to your problem.
      #!/usr/bin/perl
      use strict;
      use warnings;
      use Data::Dumper;
      use LWP::UserAgent;
      use LWP::Debug qw(+);
      use WWW::Mechanize;

      my $total_count = 0;
      my @keywords = ('simpsons', 'diehard', 'fantastic+four', );

      foreach my $keywords(@keywords){
          my $url ='http://news.google.co.uk/search?q=~%22'.$keywords.'%22&num=100&hl=en&safe=off&start=0&as_qdr=all&filter=0';
          my $mech = WWW::Mechanize->new();
          $mech->proxy('http','http://127.0.0.1:8088');
          $mech->get($url);
          print $mech->uri."\n";
          my @links_to_check = grep { $_->url() !~ /google/i} $mech->find_all_links( url_regex => qr/\./i );
          foreach my $links_to_check (@links_to_check){
              $total_count++;
              print "$links_to_check \n";
          }
      }
      print " $total_count news items found \n";
      When you print $links_to_check you will get output like WWW::Mechanize::Link=ARRAY(0x2a01214). This is because you are printing a Mechanize link object (see WWW::Mechanize::Link) rather than a string. You can dig down a bit to get the URL etc., and the documentation should help you with that.
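      As a rough sketch of what "digging down" might look like: the url(), text() and url_abs() accessors below are documented in WWW::Mechanize::Link, but the link objects here are built by hand with made-up example.com / example.org addresses so the snippet runs without a proxy or network. In your script they would come from $mech->find_all_links instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Link;

# Hand-built stand-ins for what find_all_links() would return.
# All URLs and link texts here are invented for illustration.
my @links = (
    WWW::Mechanize::Link->new( {
        url  => 'http://example.com/simpsons-review',
        text => 'Simpsons review',
        tag  => 'a',
        base => 'http://news.google.co.uk/',
    } ),
    WWW::Mechanize::Link->new( {
        url  => 'http://news.google.co.uk/intl/en/about.html',
        text => 'About Google News',
        tag  => 'a',
        base => 'http://news.google.co.uk/',
    } ),
    WWW::Mechanize::Link->new( {
        url  => '/news/story',            # relative link
        text => 'Another story',
        tag  => 'a',
        base => 'http://example.org/',
    } ),
);

# Same filter as in the script above: drop anything pointing at Google.
my @non_google = grep { $_->url() !~ /google/i } @links;

foreach my $link (@non_google) {
    # url() is the raw href, url_abs() resolves it against the page's
    # base URL, and text() is the link text (it may be undefined).
    my $text = defined $link->text ? $link->text : '(no text)';
    print $link->url_abs, " [", $text, "]\n";
}
```

      In the original loop, that would mean replacing print "$links_to_check \n"; with something along the lines of print $links_to_check->url, "\n";.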
        I changed it to LWP::Debug qw(+); and the proxy to $mech->proxy('http','http://127.0.0.1:8088'); but I am still getting the same problems, e.g.:

        LWP::UserAgent::new: ()
        LWP::UserAgent::proxy: http http://127.0.0.1:8088
        LWP::UserAgent::request: ()
        HTTP::Cookies::add_cookie_header: Checking news.google.co.uk for cookies
        HTTP::Cookies::add_cookie_header: Checking .google.co.uk for cookies
        HTTP::Cookies::add_cookie_header: Checking google.co.uk for cookies
        HTTP::Cookies::add_cookie_header: Checking .co.uk for cookies
        HTTP::Cookies::add_cookie_header: Checking co.uk for cookies
        HTTP::Cookies::add_cookie_header: Checking .uk for cookies
        LWP::UserAgent::send_request: GET http://news.google.co.uk/search?q=~%22fantastic+four%22&num=100&hl=en&safe=off&start=0&as_qdr=all&filter=0
        LWP::UserAgent::_need_proxy: Proxied to http://127.0.0.1:8088
        LWP::Protocol::http::request: ()
        LWP::Protocol::collect: read 130 bytes
        LWP::UserAgent::request: Simple response: Not Found
        Use of uninitialized value in concatenation (.) or string at movien1.pl line 28.
        0 news items found