Declarent has asked for the wisdom of the Perl Monks concerning the following question:

Okay, very odd symptom. I'm writing a network server that forks off child processes and handles some data. After I'm done, I exit the child and close the socket handles.

This works great until the 64th iteration of the client. At the 65th attempt, the client fails to connect to the server's port because the server has exited. There is no error message of any kind; it just pops like a soap bubble.

This is on a win32 machine using ActiveState Perl. Also, it may be unrelated, but the number of handles that windoze sees goes up each time the agent connects and never goes down, even with autoflush, handle closing, and child exiting. If I close the server, the handles go away; if the server crashes, they stay. Forever. What the?

I understand that fork/ithreads are beta, but what other choice do I have for parallel processing multiple connections? What's causing the crash?
Here's the server:

    #!/Perl/bin/perl -w
    # Server Test
    use IO::Socket::INET;
    $SIG{INT} = 'IGNORE';

    # Sub Defs
    sub control_socket {
        print "\nBinding Control Socket to Proto $proto Port $portnum for $lsnct listeners.\n";
        $catcher=IO::Socket::INET->new(
            Proto=>$proto,
            LocalPort=>$portnum,
            Reuse=>1,
            Listen=>$lsnct,
            Timeout=>10)
            or die "This sucks, can't bind socket" ;
        $catcher->blocking(0);
        $catcher->autoflush(1);
        shutdown $catcher,1;
    }

    sub data_socket {
        $sys_id=$_[0] ;
        local $dataport=9000 + $sys_id ;
        local $childpid=$$;
        print "child:Using Dataport $dataport for system id $sys_id \n";
        print "for childpid $childpid\n";
        $input_port=IO::Socket::INET->new(
            Proto=>$proto,
            LocalPort=>$dataport,
            Reuse=>1,
            Listen=>10)
            #Timeout=>5)
            or die "child:This sucks, can't bind socket" ;
        $input_port->blocking(0);
        $input_port->autoflush(1);
        $data_port = $input_port->accept;
        sysread($data_port,$client_data,4096);
        print "child:Recieved client data of $client_data\n";
        $datalength=length($client_data);
        print $data_port "$datalength\n";
        print "child:Done with System $sys_id\n";
        close $data_port;
        close $input_port;
        close $new_catcher;
        print "child: exit\n";
        exit 0;
    }

    # Start Processing
    $proto="tcp";
    $portnum=8888;
    $lsnct=10;
    $ct=0;
    print "Starting Test Server...\n";
    control_socket;
    for ( ;; ) {
        #control_socket;
        $input_sysid=0;
        ++$ct;
        print "Waiting for new System Connection\n";
        $new_catcher = $catcher->accept();
        #sysread($new_catcher,$input_sysid,1024);
        #recv($new_catcher,$input_sysid,1024,0);
        if ($new_catcher) {
            $input_sysid = <$new_catcher>;
            close($new_catcher);
        }
        print "input system id is $input_sysid\n";
        if ( ! $input_sysid ) {$input_sysid = 0; }
        print "Testing input for new connection\n";
        if ( $input_sysid != 0 ) {
            print "Forking off new data session\n";
            last unless (fork());
            print "At Server Iteration $ct\n";
        }
    }
    close $new_catcher;
    data_socket($input_sysid);
    # Server End
Here's the client, it runs every second counting connects:
    #!/Perl/bin/Perl -w
    # W32 Agent
    use IO::Socket::INET;
    $ct=0;
    for ( ;; ) {
        $catapult=IO::Socket::INET->new(
            Proto=>"tcp",
            PeerAddr=>"localhost",
            PeerPort=>8888,
            #Timeout=>5
            )
            or die "Can't connect! DO SOMETHING!";
        shutdown $catapult,0;
        $sys_id=1;
        $sysport=9000 + $sys_id;
        ++$ct;
        print $catapult "$sys_id\n";
        print "Send sys id of $sys_id\n";
        sleep 1;
        $data_port=IO::Socket::INET->new(
            Proto=>"tcp",
            PeerAddr=>"localhost",
            PeerPort=>$sysport,
            #Timeout=>5
            )
            or die "Can't connect data_port! DO SOMETHING!";
        $msg="client1 data 12345\n";
        $msglength=length($msg);
        #for ($i;$i <= 10;$i++){
        print $data_port $msg;
        print "Sent data of length $msglength\n";
        sysread($data_port,$servmsg,4096);
        print "Server said length was $servmsg\n";
        if ($servmsg == $msglength){
            print "We have a match!\n";
        }
        else {
            print "************OUT OF SYNC*****************\n";
        }
        #}
        print "Iteration number $ct\n";
        close $data_port;
        close $catapult;
        if ( $ct == 50 ) { sleep 60;print "Sleeping\n"; }
    }
    # End Client
Many thanks if you can unravel this unruly conundrum!
M

(MeowChow) Re: 65 is the magic number!
by MeowChow (Vicar) on May 14, 2002 at 04:01 UTC
    65 is indeed the magic number, because you're hitting the limit on per-process file descriptors for Win32 (I'm guessing 98 / ME, correct?). If you run on Win NT/2000/XP, the max goes up to 255.

    I'm unsure if there's a way to increase your 98/ME fdmax, perhaps you can try setting FILES=255 in your config.sys.

       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print

      I am running 2000, but the file descriptor max idea is interesting. I'm actually wanting to get rid of the fd's after I get rid of the child process, but I can't seem to make them go away unless I close the server.

      Even at 255, I'll crap out at some point unless I can get rid of them when I exit the child.

      Any ideas on how to nuke em?

      D

        Forking under Win32 is emulated; a child process is not actually created. That's probably why the fd's are not going away. See perlfork.
           MeowChow                                   
                       s aamecha.s a..a\u$&owag.print
(MeowChow) Re: 65 is the magic number!
by MeowChow (Vicar) on May 14, 2002 at 10:08 UTC
    Ok, I totally blew it on this one. The problem is that you are launching too many children without reaping any of them; a scenario which reduces to this:
    perl -le "print $p, ':', ++$i while $p = fork"
    The solution under Unix would be to hook $SIG{CHLD} and reap them properly, a la perlipc. Unfortunately, signals don't work properly under Windows, so I'm not sure how you should handle this. Perhaps hack something up which involves waitpid -1, &WNOHANG; (WNOHANG comes from the POSIX module) when the number of simultaneous children gets high. You may also want to consider running under Cygwin.
       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
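
The waitpid/WNOHANG idea suggested above could look roughly like the following. This is only a sketch, not a tested fix for the original server: the helper name reap_children, the $wanted count, and the stand-in loop are made up for illustration, and it assumes that non-blocking waitpid behaves under Win32 fork emulation the way it does on Unix. Pseudo-process IDs are negative under the emulation (see perlfork), so the loop checks for 0 and -1 explicitly instead of testing for a positive pid.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";   # exports WNOHANG

# Collect every child that has already exited, without blocking.
# Returns the number of children reaped on this call.
sub reap_children {
    my $reaped = 0;
    while (1) {
        my $pid = waitpid(-1, WNOHANG);
        # 0: children exist but none has exited yet; -1: no children left.
        # Pseudo-process IDs under Win32 fork emulation are negative,
        # so test for these two values rather than for $pid > 0.
        last if $pid == 0 || $pid == -1;
        $reaped++;
    }
    return $reaped;
}

# Stand-in for the accept loop: start a few short-lived children and
# call reap_children() on every pass, as the server loop would.
my $wanted = 5;            # hypothetical number of "connections"
my $total  = 0;
for my $n (1 .. $wanted) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit 0 if $pid == 0;   # child: do the work, then exit
    $total += reap_children();
}

# Drain whatever is still outstanding before shutting down.
while ($total < $wanted) {
    $total += reap_children();
    select(undef, undef, undef, 0.05);   # brief pause between polls
}
print "reaped $total of $wanted children\n";
```

In the server from the question, the call to reap_children() would go inside the for ( ;; ) loop in the parent, right after the fork, so that finished children are collected on every iteration instead of piling up.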