LiTinOveWeedle has asked for the wisdom of the Perl Monks concerning the following question:

Hi moks, first I wan to say thanks to merlyn, especially for his web techniques page. I founded there anonymous proxy which using HTTP::Daemon and LWP::UserAgent to do, what you suppose from proxy :-) But unfortunately I wasn't able to get it to work (win2k, ActiveState Perl 5.6.1 b631, last version of modules availables on CPAN)
For easier debugging I added some "logtofile" procedure and change code slightly. It seems, that there is something wrong with parent and child proceses. Whole proxy do something like this:

Master process started
1. connection
1. child forked
1. child froze !!!!
....
2. conection
2. child forked
2. child froze !!!!
....
3. connection
1. child wake up !!!!
1. child do what we need
1. child stops (due to some error?)
3. child forked
....
4. connection
2. child wake up !!!!
2. child do what we need
2. child stops (due to some error?)
C R A S H !!!!

I am including my code and logfile. Please if you have any idea what s wrong let me know.

Thank you

code sample:

#!/home/merlyn/bin/perl #use strict; use LWP::UserAgent; use HTTP::Cookies; $ENV{PATH} = join ":", qw(/usr/ucb /bin /usr/bin); $|++; $log_file_path = "proxy.log"; $log = 2; ## Copyright (c) 1996 by Randal L. Schwartz ## This program is free software; you can redistribute it ## and/or modify it under the same terms as Perl itself. ## Anonymous HTTP proxy (handles http:, gopher:, ftp:) ## requires LWP 5.04 or later my $HOST = "10.0.0.186"; my $PORT = "8008"; my $REFERER = "http://www.domain.com"; sub prefix { my $now = localtime; join "", map { "[$now] [${$}] $_\n" } split /\n/, join "", @_; } $SIG{__WARN__} = sub { warn prefix @_ }; $SIG{__DIE__} = sub { die prefix @_ }; $SIG{CLD} = $SIG{CHLD} = sub { wait; }; #my $AGENT; # global user agent (for efficienc +y) #BEGIN { # use LWP::UserAgent; # @MyAgent::ISA = qw(LWP::UserAgent); # set inheritance # $AGENT = MyAgent->new; # $AGENT->agent("anon/0.07"); # $AGENT->env_proxy; #} #my $cookie_jar = HTTP::Cookies->new; my $ua = LWP::UserAgent->new; #$ua->cookie_jar($cookie_jar); sub LWP::UserAgent::redirect_ok { 0 } # redirects should pass throu +gh { ### MAIN ### use HTTP::Daemon; my $master = new HTTP::Daemon LocalAddr => $HOST, LocalPort => $PORT; warn "set your proxy to <URL:", $master->url, ">"; my $slave; &handle_connection($slave) while $slave = $master->accept; exit 0; } ### END MAIN ### sub handle_connection { &logging( 2, "Handling conection" ); my $connection = shift; # HTTP::Daemon::ClientConn my $pid = fork(); if ($pid) { # spawn OK, and I'm the parent close $connection; return; } ## spawn failed, or I'm a good child &logging( 2, "I am the child procces" ); my $request = $connection->get_request; &logging( 2, "Creating Request object from browser call" ); if (defined($request)) { &logging( 2, "Fetching request" ); my $response = &fetch_request($request); &logging( 2, "Sending response" ); $connection->send_response($response); &logging( 2, "Response sended, closing connection" ); close $connection; } else { warn "request not defined"; } exit 0 if defined $pid; # exit if I'm a good child with a g +ood parent } sub fetch_request { my $request = shift; # HTTP::Request use HTTP::Response; my $url = $request->url; &logging( 2, "Validating URL" ); if ($url->scheme !~ /^(http|gopher|ftp)$/) { my $res = HTTP::Response->new(403, "Forbidden"); $res->content("bad scheme: @{[$url->scheme]}\n"); $res; } elsif (not $url->rel->netloc) { my $res = HTTP::Response->new(403, "Forbidden"); $res->content("relative URL not permitted\n"); $res; } else { &fetch_validated_request($request); } } sub fetch_validated_request { # return HTTP::Response my $request = shift; # HTTP::Request #print $request; ## uses global $AGENT ## warn "orig request: <<<", $request->headers_as_string, ">>>"; &logging( 2, "Removing header information" ); $request->remove_header(qw(User-Agent From Referer Cookie)); ## warn "anon request: <<<", $request->headers_as_string, ">>>"; &logging( 2, "Requesting page" ); my $response = $ua->request($request); ## warn "orig response: <<<", $response->headers_as_string, ">>>" +; &logging( 2, "Removing cookie" ); $response->remove_header(qw(Set-Cookie)); ## warn "anon response: <<<", $response->headers_as_string, ">>>" +; $response; } # ------------------------------------------------------------------ +------- # logging sub logging { my ( $time, $i ); if ( @_[0] <= $log ) { if ( not open(LOG, ">>".$log_file_path) ) { &produce_error( 7 ); die; } $time = localtime(time); if ( $file_lock ) { flock(LOG, 2); } print LOG "$time \t [${$}] \t @_[1] \r\n"; close(LOG); } }

logfile:

refresh pressed Wed May 29 12:07:31 2002 [720] Handling conection Wed May 29 12:07:32 2002 [-1180] I am the child procces refresh pressed Wed May 29 12:08:40 2002 [720] Handling conection Wed May 29 12:08:40 2002 [-1484] I am the child procces refresh pressed Wed May 29 12:08:50 2002 [720] Handling conection Wed May 29 12:08:51 2002 [-1180] Creating Request object from bro +wser call Wed May 29 12:08:51 2002 [-1180] Fetching request Wed May 29 12:08:51 2002 [-1180] Validating URL Wed May 29 12:08:52 2002 [-1448] I am the child procces refresh pressed Wed May 29 12:09:26 2002 [720] Handling conection Wed May 29 12:09:26 2002 [-1484] Creating Request object from bro +wser call Wed May 29 12:09:26 2002 [-1484] Fetching request Wed May 29 12:09:26 2002 [-1484] Validating URL CRASH!!!! --------------------------- Wed May 29 12:11:19 2002 [1276] Handling conection Wed May 29 12:11:19 2002 [-1156] I am the child procces refresh pressed Wed May 29 12:11:37 2002 [1276] Handling conection Wed May 29 12:11:38 2002 [-1528] I am the child procces refresh pressed Wed May 29 12:13:22 2002 [1276] Handling conection Wed May 29 12:13:23 2002 [-1156] Creating Request object fro +m browser call Wed May 29 12:13:23 2002 [-1156] Fetching request Wed May 29 12:13:23 2002 [-1156] Validating URL Wed May 29 12:13:23 2002 [-1284] I am the child procces refresh pressed Wed May 29 12:13:40 2002 [1276] Handling conection CRASH!!!!

Li Tin O've Weedle
mad Tsort's philosopher

Replies are listed 'Best First'.
Re: Merlyn Proxy sever fail on win32?
by Jenda (Abbot) on May 29, 2002 at 12:57 UTC

    fork() doesn't create processes but threads under Windows. Each thread gets a copy of all the data stored in Perl managed space. The cloning (making a duplicate of the data structures) doesn't affect the private data stored by the XS part of modules. So you only have one copy of that. It also doesn't dup system handles (except I believe sockets and filehandles). This means that

    1. Since some data is shared between threads and the modules were not written to handle this it's possible that more threads will access the data at once and corrupt it.
    2. When a thread finishes all Perl objects are destroyed. The DESTROY method of some object frees some XS memory or some handles. So if the memory or the handle is shared by several threads it will suddenly become invalid.

    To try to fix the first problem you'll need to use some locking to prevent concurrent access and wrap the function/method calls in that. I'm afraid the performance will suffer and it will be hard to find out whether this is the problem and what to lock.

    For the second problem you have to ensure the destroyed objects do not free the memory or the handles. There are several solutions.

    1. You can redefine the objects' DESTROY method by your own in all but one (the last to die) threads. - This'll only work fine if you do not create new objects in the threads after fork().
    2. You can rebless() the objects to a different class. Either to a void class just before you destroy them (hard to do safely) or to a "fake" class that inherits all methods except DESTROY (and maybe Close) from the original one immediately after fork().

It may be hard to do though because the object that needs this treatment may be well hidden inside a different object.

Another problem may be that a module installs an END block that'll free some memory or handles. In that case you'll need to try to "turn it off" by handmodifying some variables. You have to look at the module source of course.

Even if you do all this it is not guaranteed to help. Some modules just do not work well with threads (yet). DBI comes to mind. In that case you might end up changing the code so that only one thread uses that module and provides its functionality to the other threads somehow.

On the other hand I have a script that uses LWP in several threads and it works just fine. I do not create and destroy threads all the time though. I precreate several worker threads and pipe them commands. And it does sometimes crash while exiting (like that it would matter).

  Jenda

P.S.: I wrote something on this some time ago, it's available here.

P.P.S.: Nemyslim, ze bys na ty strance "zalozil" anonymous proxy ;-)

Re: Merlyn Proxy sever fail on win32?
by ajt (Prior) on May 29, 2002 at 12:02 UTC

    Fork is only emulated on Win32, it's not fully compatible, it's not very stable, and though I've known it to work, I'm not surprised it doesn't in your case.

    You may be able to use the Win32::Process module to achieve the similar results. Though you may also have to tinker with the signals too.

    Sorry I can't offer a better solution, others may be able to.

Re: Merlyn Proxy sever fail on win32?
by Anonymous Monk on May 29, 2002 at 14:45 UTC
    can someone throw a 'read more' tag on this?