r1n0 has asked for the wisdom of the Perl Monks concerning the following question:

Hello brother monks,
I am seeking wisdom on how one might use IO::Socket as a Proxy server. I have looked at HTTP::Proxy and tested it. It runs beautifully, and I applaud Philippe Bruhat for his module. Unfortunately, I can't modify his stuff easily to fit my needs, since I am only a simple minded monk.
I have borrowed code from some forums to slap together a proxy server using IO::Socket, but it is not acceptable on the performance side. Basically, it has: a server running that accepts incoming connections from a client. Then, I want it to use the incoming information to connect to the server within the request sent in. I basically want to match exactly what the client browser is sending to the server and forward that on to the actual server providing the content. Here is what I have started with. I thank whoever the author is, wherever you are. The code, though, seems extremely slow and most requests from a client leave the browser hanging indefinitely.
Any help would be most appreciated. I mainly want to speed things up and understand why the code is causing the browser to hang. I know there needs to be more control on the detached threads, too, so the system doesn't spawn too many of them. Since it only is using IO::Socket for the connections (server and client), I figured this would make a pretty fast proxy. Thank you in advance for any help!

#!/usr/bin/perl

  use strict;
  use warnings;
  use URI;
  use IO::Socket;
  use threads('yield', 'stack_size' => 64*4096);

  my $plog_file = "/tmp/passthrough_proxy.log";

  # don't want to die on 'Broken pipe' or Ctrl-C
  $SIG{PIPE} = 'IGNORE';

  my $showOpenedSockets=1;

  my $proxy = IO::Socket::INET->new (
   LocalAddr => '192.168.0.5',
   LocalPort => 7070,
   Type => SOCK_STREAM,
   Reuse => 1,
   Listen => SOMAXCONN,
   #Timeout => 10,
  ) or die "\n\nABORTING START:\n\tCan not open proxy socket on port 7070\n\n";

  binmode $proxy;

  while (my $client = $proxy->accept()) {
   plog("\n\n----Accepted Connection--------\n");

   binmode $client;
   my $t = threads->new(\&fetch, $client);
   $t->detach;
   }

sub fetch{

  my $browser = $_[0];
  my $method ="";
  my $content_length = 0;
  my $content = 0;
  my $accu_content_length = 0;
  my $host;
  my $hostAddr;
  my $httpVer;

  while (my $browser_line = <$browser>) {
   plog("Browser Line: $browser_line");
   if( $browser_line =~ /^Proxy-Connection:/i   ){
  plog("\tThrowing out Browser line due to identifying Proxy\n");
  next;
  }
  unless ($method) {
  ($method, $hostAddr, $httpVer) = $browser_line =~ /^(\w+) +(\S+) +(\S+)/;

  plog("$hostAddr");
  my $uri = URI->new($hostAddr) or print "Could not build URI for $hostAddr";

  $host = IO::Socket::INET->new (
  PeerAddr=> $uri->host,
  PeerPort=> $uri->port,
  Timeout=> 10,
  ) or die "couldn't open $hostAddr" unless $host;

  if ($showOpenedSockets) {
  plog("\n\tOpened ".$uri->host." , port ".$uri->port."\n");
  }

  binmode $host;

  print $host "$method ".$uri->path_query." $httpVer\n";
  plog("$method ".$uri->path_query." $httpVer\n");
  next;
  }

  $content_length = $1 if ( $browser_line=~/Content-length: +(\d+)/i );
  $accu_content_length+=length $browser_line;

  print $host $browser_line;

  last if $browser_line =~ /^\s*$/ and $method ne 'POST';
  if ($browser_line =~ /^\s*$/ and $method eq "POST") {
   $content = 1;
   last unless $content_length;
   next;
   }
  if ($content) {
  $accu_content_length+=length $browser_line;
  last if $accu_content_length >= $content_length;
  }
}
  plog("\n\nThread id --> ".threads->self->tid()."\n");
  plog("No of. threads at present --> ".threads->list()."\n");
  plog("Stack size --> ".threads->get_stack_size()."\n");

  $content_length = 0;
  $content = 0;
  $accu_content_length = 0;
  while (my $host_line = <$host>) {
   print $browser $host_line;
   $content_length = $1 if $host_line=~/Content-length: +(\d+)/i;
   if ( length($host_line) < 1 ){
   close($host);
   last;
   }
  if ($host_line =~ m/^\s*$/ and not $content) {
  $content = 1;
  #last unless $content_length;
  #close($host); #Let's see if we can force a close on no returned info
  next;
  }
  if ($content) {
   if ($content_length) {
   $accu_content_length+=length $host_line;
   #print "\nContent Length: $content_length, accu: $accu_content_length\n";
   last if $accu_content_length >= $content_length;
   }
   }
  }
  $browser-> close;
  $host -> close;

  plog("---- End thread ----> ID: ".threads->self->tid()."\n");
  }

sub plog{
  my $msg = $_[0];
  open(LOG,">>$plog_file") or warn "Can NOT open $plog_file for appending\n\tREASON: $!\n\n";
  if ( ! $msg=~ /\n$/ ){
   $msg = $msg . "\n";
   }
  print LOG $msg;
  close(LOG);
  }

Replies are listed 'Best First'.
Re: IO::Socket for HTTP Proxy
by merlyn (Sage) on Sep 30, 2009 at 17:09 UTC
    Just use socat, and ignore Perl for this:
    socat TCP4-LISTEN:9999 TCP4:otherhost:8888
    proxies local port 9999 to remote port 8888. Too easy.

    -- Randal L. Schwartz, Perl hacker

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

      Randal,

      Thank you for the response. I would love to use the socat program for such, but this is a test I am running for a bigger picture. I already have an archive that I accessible through an IO::Socket proxy server I have written. Right now, it will only dish out resources that are within the archive. If a requested resource is not within the archive it allows a user to task the resource for collection. I want to provide a method for an excluded list of IP addresses to bypass the archive and go straight to the web through the same proxy server. These excluded systems will be allowed to browse without exclusion or limitation. This is why the code I provided uses IO::Socket for both sides. The code already written uses an IO::Socket server setup as a proxy (for what is in archive), and figuring out how to create the passthrough proxy capability with IO::Socket will prevent me from rewriting a lot of code. I have used socat before and like it for what it is.

      Thanks.