
Well, my guess is that we need to modify our "bad" header. We get the "bad" header when the initial request is redirected to some "dummy" location. Such locations are pretty common in my case (.../sorry...; .../Error/...; etc.), so we just need to recover the initial request from the "bad" one. For example, this is a bad request:

http://www.google.co.uk/sorry/?continue=http://www.google.co.uk/search%3Fq%3Djust+an+example

We can detect it by the /sorry/ in the middle. This link leads straight to the captcha request page, which is not what we need; we need just:

http://www.google.co.uk/search%3Fq%3Djust+an+example
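The rewrite described above can be sketched in plain Perl, without any proxy involved. This is only an illustration: strip_sorry_redirect and percent_decode are hypothetical helper names, not part of LWP or HTTP::Proxy, and decoding the %3F/%3D escapes is an extra step I am assuming may be wanted before the URL is requested again.

```perl
use strict;
use warnings;

# Hypothetical helper: if $location is a /sorry/ redirect, return whatever
# follows "/sorry/?continue="; otherwise return $location unchanged.
sub strip_sorry_redirect {
    my ($location) = @_;
    return $location unless defined $location;
    return $1 if $location =~ m{/sorry/\?continue=(.+)\z}s;
    return $location;
}

# Hypothetical helper: decode %XX escapes (assumes the target was
# URL-encoded exactly once).
sub percent_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

my $bad  = 'http://www.google.co.uk/sorry/?continue=http://www.google.co.uk/search%3Fq%3Djust+an+example';
my $good = strip_sorry_redirect($bad);

print "$good\n";                    # http://www.google.co.uk/search%3Fq%3Djust+an+example
print percent_decode($good), "\n";  # http://www.google.co.uk/search?q=just+an+example
```

A URL without /sorry/ in it passes through untouched, so the helper is safe to call on every redirect.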

So we take the "bad" Location header and replace it with a "good" one. Although I don't have much experience, I tried to write some code; at least I'd like to believe that it's code and not just a mess :). I'm also stuck on one thing: I can't figure out how to pass additional parameters to HTTP::Proxy once the new() method has already been called.
#!/usr/bin/perl
use strict;
use warnings;

use HTTP::Proxy qw( :log );
use HTTP::Proxy::HeaderFilter::simple;
use LWP::UserAgent;

# user agent the proxy uses for its outgoing requests
my $ua = LWP::UserAgent->new();
$ua->proxy( ['http'], 'http://127.0.0.1:29999' );
$ua->timeout(10);
$ua->agent('Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.60 Safari/534.24');

# ##<----- 2)
open( my $log_fh, '>>', '/var/log/repeater.log' )
    or die "Cannot open /var/log/repeater.log: $!";

my $proxy = HTTP::Proxy->new(
    port  => 38374,
    agent => $ua,
    logfh => $log_fh,    # the filehandle itself; <LOGFILE> would read a line instead
);
#HTTP::Proxy->new(@ARGV); ### <--3)
$proxy->logmask( ALL );

$proxy->push_filter(
    host     => 'google.com',    # only apply to this domain
    response => HTTP::Proxy::HeaderFilter::simple->new(
        sub {
            my ( $self, $headers, $response ) = @_;

            # skip non redirects
            return if $response->code !~ /^3/;

            # pick up location (a redirect without one has nothing to fix)
            my $location = $headers->header('Location');
            return unless defined $location;

            # find bad redirections
            if ( $location =~ m{google\.com/sorry/} ) {

                # change the redirect: keep only what follows "/sorry/?continue="
                ( my $new_location = $location ) =~ s{.*/sorry/\?continue=}{};
                $headers->header( Location => $new_location );

                # print some logging information
                $self->proxy->log( ALL, LOCATION => "$location => $new_location" );
            }
        }
    )
);

$proxy->start;
P.S. The clients operate through parent proxies, so it makes sense to try repeating the request. Thanks in advance, Sergey.