ETA: see the bottom. I've reproduced it in simple code and know roughly what's wrong, but I still have questions about how to fix it.
I thought I understood the basics of a forking server; I've certainly seen dozens of code samples. But the one I've written is hanging on the first $socket->read() call in the child (where that socket is the one created in the parent by the accept() call on the listening socket).
I'm running Perl 5.8.8 on Linux, so I should have autoflush set by default (setting it manually hasn't helped, anyway). I'm using IO::Socket::INET.
What I'm seeing exactly is that the child hangs on the read call until it gets a SIGINT (sent as part of the shutdown when I kill the test program), at which point the read call returns NOT with an error, but with the expected data.
The one thing I've noticed I'm doing differently from the sample servers is that mine reads one line in the parent, before forking the child. (It's getting information to do load balancing, based on the request in the first line of the request; by doing that in the parent, I keep the load information conveniently in one place without needing to do ipc from child to parent. This is for a local protocol and it needs to spread things that used to go to one server across many based on the requests in the first line, so an ordinary load-balancer package won't do what I need. And the call rate will be low, dozens a second rather than thousands.)
I'm wondering if there's magic I have to do to avoid conflicts between the parent and the child on that socket. Should I undef it on the parent after forking the child? I'm pretty sure closing it is wrong (tried it).
I'm also wondering if using $socket->getline() once followed by a string of $socket->read() is causing my problem. The doc seems to indicate they're compatible, at least by omission (it makes a point of saying sysread is NOT compatible with read/readline).
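To check the getline()/read() compatibility point for myself, here is a minimal sketch on a local socketpair rather than a real TCP connection. The helper name after_getline and the test payload are made up for illustration. It assumes getline() pulls everything already written into the handle's buffer, which is what I'd expect from buffered IO; a following read() is then served from that buffer, while sysread() bypasses it and (in non-blocking mode, so the demo can't hang) finds nothing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# Hypothetical demo helper: what is left for the next call after
# getline() has pulled a chunk of input into the handle's buffer?
sub after_getline {
    my ($use_sysread) = @_;
    socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";
    $writer->autoflush(1);
    print {$writer} "special,endofworld\nEND";    # one line plus 3 bytes

    my $line = $reader->getline();    # buffered read of the first line
    my $rest;
    if ($use_sysread) {
        $reader->blocking(0);         # so the demo cannot hang
        my $n = sysread($reader, $rest, 3);    # bypasses the buffer
        $rest = '' unless $n;         # nothing left in the kernel
    }
    else {
        $reader->read($rest, 3);      # served from the same buffer
    }
    return ($line, $rest);
}

my ($line_a, $rest_a) = after_getline(0);
my ($line_b, $rest_b) = after_getline(1);
print "read()    after getline saw: '$rest_a'\n";
print "sysread() after getline saw: '$rest_b'\n";
```

If that assumption holds, read() sees the trailing END and sysread() sees nothing, so getline() in the parent followed by sysread() in the child could silently lose whatever was already buffered.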
Any thoughts? All the online stuff I've found is either showing very simple servers, or else talking about abstruse details in a particular case.
The logs below show a client connection being accepted, the first line of the request read, a routing decision made based on that data, the child forked, the server connection opened, and two lines sent to the server connection. Then, at 2011-10-12 13:16:35,254, it enters the $socket->read() call and hangs until I kick it (by shutting down the test program, which sends signals to the proxy as well as the test client and server). So, at 13:16:52,531 the parent's accept returns EINTR, and at 13:16:52,563 the data that was sent much earlier is finally read.
2011-10-12 13:16:34,972 DEBUG kcmdproxy(22168):509 Accepted connection 1 from 192.168.1.23:52674
2011-10-12 13:16:35,002 DEBUG kcmdproxy.io(22168):388 bufrdline: <<special,endofworld<lf>>>
2011-10-12 13:16:35,028 INFO kcmdproxy(22168):538 Routed "special,endofworld" to localhost.localdomain(127.0.0.1):2000
2011-10-12 13:16:35,034 DEBUG kcmdproxy.child(22171):564 In child
2011-10-12 13:16:35,051 DEBUG kcmdproxy.io(22171):435 Output: <<PROXY,192.168.1.23<lf>>>
2011-10-12 13:16:35,036 INFO kcmdproxy(22168):652 Forked child 22171
2011-10-12 13:16:35,052 DEBUG kcmdproxy.io(22171):435 Output: <<special,endofworld<lf>>>
2011-10-12 13:16:35,130 DEBUG kcmdproxy.child(22171):578 Copying request data from client to server
2011-10-12 13:16:35,170 DEBUG kcmdproxy.io(22171):404 sockcopy start keep 0
2011-10-12 13:16:35,214 DEBUG kcmdproxy.io(22171):338 bufrd
2011-10-12 13:16:35,254 DEBUG kcmdproxy.io(22171):344 bufrd a
2011-10-12 13:16:52,531 INFO kcmdproxy(22168):496 Accept returned EINTR
2011-10-12 13:16:52,563 DEBUG kcmdproxy.io(22171):349 bufrd res <<8>>: <<END,END<lf>>> err
2011-10-12 13:16:52,564 DEBUG kcmdproxy.io(22171):435 Output: <<END,END<lf>>>
Any ideas? Going quite batty here, and we needed this done last week.
I recreated the problem in a simple server. This server, when talked to by nc, reproduces my problem exactly (yay!):
#! /usr/bin/perl
use strict;
use warnings;
use IO::Socket;
use POSIX qw ( :sys_wait_h :fcntl_h );
use Errno qw ( EINTR EAGAIN );

my $testport = 8080;

# This is what runs in the child process
sub kidstuff {
    my $sock = shift;

    # Read and log whatever comes in
  BUFRD: while (1) {
        $! = 0;
        my $data;
        my $res = $sock->read($data, 99);
        # res: #chars, or 0 for EOF, or undef for error
        die "read failed on $!" unless defined($res);
        last BUFRD if $res == 0;    # EOF
        print "Read($res): $data\n";
    }
}

$|=1;    # autoflush

my $listener = IO::Socket::INET->new (
    LocalPort => $testport,
    Type      => SOCK_STREAM,
    Proto     => 'tcp',
    Reuse     => 1,
    Listen    => 5,
);
if (!defined($listener)) {
    die "Failed to listen on port $testport: $!";
}

CLIENT: while (1) {
    my $client = $listener->accept();
    if (!defined($client)) {
        # Some kind of error
        if ($! == EINTR) {
            print "Accept returned EINTR\n";
            next CLIENT;
        }
        # If it's an undef other than EINTR, maybe not really a client,
        die "Accept error: $!";
    }

    # Read first line from client
    my $l1 = $client->getline();
    die "client read error $!" unless defined($l1);
    print "Server, first client line is $l1\n";

    # Now fork server
    my $kid = fork();
    die "Fork failed" unless defined($kid);
    if ($kid == 0) {
        print "Child $$ running\n";
        kidstuff($client);
        print "Child $$ complete, exiting\n";
        exit 0;
    }

    # Parent continues here.
    while ((my $k = waitpid(-1, WNOHANG)) > 0) {
        # $k is kid pid, or -1 if no such, or 0 if some running none dead
        my $stat = $?;
        print "Reaped $k stat $stat\n";
    }
} # CLIENT: while (1)
The problem is that $socket->read($buf,$size) isn't returning until EOF or $size bytes are seen. I thought of that days ago, and thought I had run a test to eliminate it in my complicated code; but in this simple code it's clearcut (suppose I should have gone for the simple case earlier?).
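For contrast, here is a minimal sketch of the unbuffered counterpart, sysread(), which as far as I can tell returns as soon as any data is available instead of waiting for the full count. It uses a local socketpair instead of a TCP connection, and the helper name one_sysread is made up for illustration. The writer is deliberately left open, so there is no EOF to release the call.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;
use IO::Handle;

# Hypothetical helper: send $payload over a local socketpair and
# return what a single sysread() on the other end delivers.
sub one_sysread {
    my ($payload, $want) = @_;
    socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";
    $writer->autoflush(1);
    print {$writer} $payload;    # writer stays open: no EOF yet

    my $data;
    my $res = sysread($reader, $data, $want);
    # res: #bytes, or 0 for EOF, or undef for error
    die "sysread failed: $!" unless defined $res;
    return ($res, $data);
}

# Ask for up to 99 bytes when only 8 have been sent
my ($res, $data) = one_sysread("END,END\n", 99);
print "sysread returned $res bytes: $data";
```

The same request through $sock->read($data, 99) would sit there until 99 bytes or EOF arrived, which matches the hang in the test server.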
So, what's the solution? Non-blocking IO? As in $sock->Blocking(0)? What I want is a call that blocks until there is data to return, and then returns (but doesn't care about amount). I could do 1-character reads, and assemble the lines myself, but that seems really stupid-ass. If I set non-blocking IO, how do I distinguish EOF from no data? And how do I avoid spinning and wasting CPU resources? Do I have to resort to select() for something this simple?
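On the non-blocking questions, a minimal sketch of how the three cases could be told apart, assuming sysread() on a non-blocking socket: a positive count is data, 0 is EOF, and undef with $! set to EAGAIN (or EWOULDBLOCK) means "no data yet". Waiting in IO::Select's can_read() is what avoids spinning. It runs on a local socketpair, and the helper name try_read is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket;
use IO::Handle;
use IO::Select;
use Errno qw(EAGAIN EWOULDBLOCK);

# Hypothetical helper: classify the result of one non-blocking sysread().
sub try_read {
    my ($sock) = @_;
    my $res = sysread($sock, my $data, 4096);
    return ('data', $data) if $res;            # got some bytes
    return ('eof',  '')    if defined $res;    # 0 means EOF
    return ('nodata', '')  if $! == EAGAIN || $! == EWOULDBLOCK;
    die "sysread failed: $!";
}

socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";
$reader->blocking(0);
$writer->autoflush(1);

my @seen;
push @seen, (try_read($reader))[0];    # nothing written yet

print {$writer} "END,END\n";
my $sel = IO::Select->new($reader);
my @ready = $sel->can_read(5);         # sleeps in select(), no spinning
die "timed out waiting for data" unless @ready;
push @seen, (try_read($reader))[0];    # the 8 bytes just written

close $writer;                         # peer goes away
@ready = $sel->can_read(5);            # EOF also makes the handle readable
die "timed out waiting for EOF" unless @ready;
push @seen, (try_read($reader))[0];

print "@seen\n";
```

If a plain blocking sysread() already behaves the way I want, none of this machinery should be necessary for a one-socket child; it would only matter once the child has to watch more than one handle.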
(Line-oriented IO is where I started. That turns out not to work because some of the existing clients omit the line terminator at the end of the session, and a line-read hangs waiting for them. And the whole point of this server is to let us redistribute the load, and reshuffle which functions are in which server, and how many copies are run of each server, without having to change clients and go through the whole deployment process again.)
In reply to Socket descriptor passed across fork -- hanging by dd-b