in reply to Socket Hangs Revisited

That's because of your signal handler. Your parent gets stuck endlessly in the waitpid loop
sub REAPER { {} until ( waitpid(-1, WNOHANG) == -1) }

since waitpid returns the PID of the deceased process, or -1 if there are no more child processes. See waitpid.

Your waitpid call will not return -1 since there's the monitor process hanging around. So waitpid will collect the PID of the child, but then loop forever. If you kill the monitor process, you'll see that your parent returns to its socket to listen. Change your handler to

sub REAPER { 1 until ( waitpid(-1, WNOHANG) > 0) }

and all should be fine. (1 is enough, since constructing a hashref each time through the loop doesn't make much sense.)
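For comparison, here is a minimal sketch of the reaper idiom as it appears in perlipc: keep calling waitpid while it returns a positive PID, and stop on either 0 (children exist but none has exited) or -1 (no children at all). The %reaped hash and the two forked stand-in children are assumptions made for illustration only; they are not taken from the original script.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %reaped;   # hypothetical bookkeeping: pid => exit status

# Reap every child that has already exited, then return.
# The loop ends on 0 (children still running) as well as on -1 (no children).
sub REAPER {
    local ($!, $?);   # keep the main program's error and status values intact
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped{$pid} = $? >> 8;
    }
}
$SIG{CHLD} = \&REAPER;

# Stand-in for the long-running monitor process.
my $monitor = fork();
die "fork failed: $!" unless defined $monitor;
if ($monitor == 0) { sleep 30; exit 0 }

# Stand-in for a short-lived connection handler.
my $worker = fork();
die "fork failed: $!" unless defined $worker;
if ($worker == 0) { exit 3 }

# Poll until the handler has recorded the worker. The handler returns each
# time it runs, even though the monitor is still alive.
select(undef, undef, undef, 0.1) until exists $reaped{$worker};

print "worker exited with status $reaped{$worker}; monitor still running\n";

kill 'TERM', $monitor;
waitpid($monitor, 0);   # returns -1 if the handler reaped it first
```

The point of the sketch is only that the handler gives control back to the parent while the monitor keeps running, so the parent can return to its socket.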

--shmem

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

Re^2: Socket Hangs Revisited
by mungohill (Acolyte) on Jun 11, 2007 at 16:15 UTC
    Of course, you're right about this, but you appear to have made a slightly wrong guess about the solution. I looked at the docco on waitpid (at the perl and c lib level) and couldn't find any specification of what waitpid returned if (a) there was a child process (b) it wasn't defunct and (c) you specified WNOHANG. Experimentally it turned out to be 0, which is kind of logical. So rather than checking against '> 0', I'm checking against '== 0'.
    I liked your ever so discreet point about the empty hashref, lightly dismissing the possibility that I might have been crass enough to think I was specifying an empty block.
      I looked at the docco on waitpid (at the perl and c lib level) and couldn't find any specification of what waitpid returned if (a) there was a child process (b) it wasn't defunct and (c) you specified WNOHANG.

      Huh? From the waitpid section:

      waitpid PID,FLAGS
      Waits for a particular child process to terminate and returns the pid of the deceased process, or "-1" if there is no such child process. On some systems, a value of 0 indicates that there are processes still running. The status is returned in $?. If you say
          use POSIX ":sys_wait_h";
          #...
          do {
              $kid = waitpid(-1, WNOHANG);
          } until $kid > 0;
      then you can do a non-blocking wait for all pending zombie processes.

      Emphasis mine. So, if you happen to be on a system where 0 indicates running processes, your test is wrong. It is also wrong to test for $kid == 0 as waitpid returns the pid of the deceased process. Again in your ordering:

      • (a) waitpid returns the PID of the deceased process
      • (b) waitpid returns 0 on some systems
      • (c) WNOHANG makes waitpid non-blocking: it returns immediately even if no child has exited yet, hence the loop
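The three cases above can be checked with a small sketch. It assumes a system where waitpid reports 0 for a child that is still running (the documentation only promises this "on some systems"); the classify helper and the one-second child are hypothetical names and values chosen for the demonstration.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Hypothetical helper: name the three kinds of return value.
sub classify {
    my ($ret) = @_;
    return $ret > 0  ? "reaped"
         : $ret == 0 ? "still running"
         :             "no children";
}

my $child = fork();
die "fork failed: $!" unless defined $child;
if ($child == 0) { sleep 1; exit 0 }

my @seen;
push @seen, classify(waitpid(-1, WNOHANG));  # child exists but has not exited
waitpid($child, 0);                          # blocking wait reaps the child
push @seen, classify(waitpid(-1, WNOHANG));  # nothing left to wait for

print join(", ", @seen), "\n";
```

On Linux I would expect this to print "still running, no children", which matches the experimental 0 reported above. A handler that must work on both kinds of system is safest treating anything that is not a positive PID as "stop looping".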

      --shmem
