Well, my guess is that we need to modify our "bad" header. We get the "bad" header when the initial request is redirected to some "dummy" location. Such locations are pretty common in my case (.../sorry...; .../Error/...; etc.), so we just need to recover the initial request from the "bad" one. For example, this is a bad request:
http://www.google.co.uk/sorry/?continue=http://www.google.co.uk/search%3Fq%3Djust+an+example
We can detect it by the ../sorry/.. in the middle. This link leads us straight to the captcha request page, which is not what we need. We need just the URL from the continue parameter: http://www.google.co.uk/search%3Fq%3Djust+an+example
So we extract the "bad" location from the "Location" header and replace it with a "good" one. Although I don't have much experience, I tried to write some code; at least I'd like to believe that it's code, and not just a mess :). Also, I'm stuck on one thing: I can't figure out how to pass additional parameters to an HTTP::Proxy object once the new() method has already been called.
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Proxy qw( :log );
use HTTP::Proxy::HeaderFilter::simple;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new();
$ua->proxy( ['http'], 'http://127.0.0.1:29999' );
$ua->timeout(10);
$ua->agent('Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.60 Safari/534.24');

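open( my $log_fh, '>>', '/var/log/repeater.log' )    # ##<----- 2)
    or die "Cannot open /var/log/repeater.log: $!";
my $proxy = HTTP::Proxy->new(
    port  => 38374,
    agent => $ua,
    # pass the filehandle itself; <LOGFILE> would read a line from the file
    logfh => $log_fh,
);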
#HTTP::Proxy->new(@ARGV); ### <--3)
$proxy->logmask( ALL );
$proxy->push_filter(
    # host is matched as a regex; only apply to this domain
    # (note: this does not match the google.co.uk example above)
    host     => 'google\.com',
    response => HTTP::Proxy::HeaderFilter::simple->new(
        sub {
            my ( $self, $headers, $response ) = @_;

            # skip non-redirects
            return if $response->code !~ /^3/;

            # pick up the location; a 3xx response may not carry one
            my $location = $headers->header('Location');
            return unless defined $location;

            # find bad redirections
            if ( $location =~ m{google\.com/sorry/\?continue=(.+)} ) {

                # change the redirect: keep only the "continue" URL and
                # percent-decode it (%3F is "?", %3D is "=")
                my $new_location = $1;
                $new_location =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
                $headers->header( Location => $new_location );

                # print some logging information
                $self->proxy->log( ALL,
                    LOCATION => "$location => $new_location" );
            }
        }
    )
);
$proxy->start;
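
To check the rewriting step on its own, without starting the proxy, here is a minimal sketch of just the string handling. The helper name good_location is made up for this illustration (it is not part of HTTP::Proxy), and it assumes the original URL always arrives in a percent-encoded "continue" parameter, as in the example request above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: given a Location value,
# return the URL carried in the "continue" parameter of a /sorry/
# redirect, or undef if the location does not look like one.
sub good_location {
    my ($location) = @_;
    return unless defined $location;

    # assumption: the original URL sits in "continue", up to the next "&"
    return unless $location =~ m{/sorry/\?continue=([^&]+)};
    my $target = $1;

    # the parameter is percent-encoded (%3F is "?", %3D is "=")
    $target =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $target;
}

my $bad = 'http://www.google.co.uk/sorry/?continue='
        . 'http://www.google.co.uk/search%3Fq%3Djust+an+example';
print good_location($bad), "\n";
```

For the example request this prints http://www.google.co.uk/search?q=just+an+example, which is the URL the client asked for in the first place.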
P.S. The clients operate through parent proxies, so it makes sense to try to repeat the request.
Thanks in advance, Sergey.