in reply to IO::Socket client does not detect when server network connection dies

Is this normal?

If you are strictly a consumer of data provided by the server (you only do reads), there is no notification when the server's network connection dies. You have to infer that it died by noticing that you haven't received anything from it for a while, usually via a timeout set with alarm() and caught as a SIGALRM signal.

If you write to a socket where the reader has gone away, that triggers a SIGPIPE signal.
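
For what it's worth, here is a minimal sketch of the write side, assuming you would rather see an error return than have the signal kill your client. A local socketpair stands in for the real connection, and write_or_report is a made-up helper name, not anything from IO::Socket.

```perl
use strict;
use warnings;
use Socket;

# Ignore SIGPIPE so that writing to a dead peer returns an error
# instead of terminating the process.
$SIG{PIPE} = 'IGNORE';

# Hypothetical helper: returns 1 on success, 0 if the peer is gone.
sub write_or_report {
    my ($sock, $data) = @_;
    my $written = syswrite($sock, $data);
    return 1 if defined $written;
    return 0 if $!{EPIPE} || $!{ECONNRESET};
    die "write failed: $!";
}

# Stand-in for a real connection: a connected pair of local sockets.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

close $server;    # simulate the reader going away

my $ok = write_or_report($client, "hello\n");
print $ok ? "peer still there\n" : "peer is gone\n";
```

With a real TCP connection, the first write after the peer vanishes can still appear to succeed, because the data only goes into the local send buffer. The error tends to show up on a later write.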

Your recursive call is weird. You can chew up a lot of stack space this way, all for nought. Also, if connect is failing for some reason other than the timeout, your code could bombard the poor server with a huge number of requests very quickly, all the while burning lots of stack. A simple while loop should suffice, like this, or whatever you want: anything except pushing deeper into the stack.

my $socket;
# Parenthesize the assignment: "!$socket = ..." parses as "(!$socket) = ..."
while ( !( $socket = IO::Socket::INET->new(
            Proto    => "tcp",
            PeerAddr => "192.168.1.105",
            PeerPort => "5000",
            Timeout  => "1" ) ) )
{
    sleep 1;    # or other backoff algorithm here
}

Re^2: IO::Socket client does not detect when server network connection dies
by halfcountplus (Hermit) on Jul 11, 2011 at 23:26 UTC

    > Your recursive call is weird. You can chew a lot of stack space this way and for nought. Also if connect is failing due to something other than the timeout, your code could bombard the poor server with a huge number of requests and very quickly and all the while burning lots of stack. A simple while loop should suffice, like this or whatever you want anything except pushing deeper into the stack.

    Ditto. As ugly as it may seem, you need to institute a maximum number of retries (i.e., *not* infinite recursion), and if that limit is reached, you throw a significant error and/or inform the CLIENT user. It also makes a lot of sense to put a delay of some seconds in this loop, so the server doesn't get bombed by the client.

    > See Knowing when a socket connection is dead

    The bit in the first reply by "castaway" is the tish: it's dead (disconnected) when can_read() indicates data is available, but reading from it gives you no bytes. If you are paranoid like me you also give this a few chances in a loop, but regardless: that is the ONLY way to detect a dropped connection. Read() etc. will not necessarily return an error.
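
A rough sketch of that check, assuming IO::Select and sysread, with a local socketpair standing in for the real server connection. poll_socket and its return strings are invented for this example.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# Hypothetical helper: returns 'data', 'closed', 'idle', or 'error'.
sub poll_socket {
    my ($sock, $timeout, $buf_ref) = @_;
    my $sel = IO::Select->new($sock);
    return 'idle' unless $sel->can_read($timeout);

    my $n = sysread($sock, $$buf_ref, 4096);
    return 'error'  unless defined $n;    # genuine read error, see $!
    return 'closed' if $n == 0;           # readable but zero bytes: EOF
    return 'data';
}

# Stand-in for a real connection: a connected pair of local sockets.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

syswrite($server, "ping\n");
my $buf   = '';
my $first = poll_socket($client, 1, \$buf);    # data is waiting

close $server;                                 # peer closes its end
my $second = poll_socket($client, 1, \$buf);   # readable, but nothing to read

print "$first, then $second\n";
```

This catches a peer that closed its end. A link that silently drops produces neither data nor EOF, so the 'idle' case still needs a timeout policy of its own.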

      Sounds like you are agreeing with me. The code in the OP could get really nasty. There are things that will cause connect to fail without waiting out the 1 second timeout. The server could get flooded with connect requests as fast as the client can issue them.

      I showed a fixed one second delay, but some variation is better, to move us out of any possible "sync" with others trying to do the same thing. Many retry algorithms implement back-off at an exponential rate up to some limit (a few retries real fast, then lengthening the time between retries).

      The main point is not to make any recursive calls.

      my $socket;
      my $max_tries = 50;
      my $cur_try   = 0;
      # Parenthesize the assignment: "!$socket = ..." parses as "(!$socket) = ..."
      while ( ( $cur_try++ < $max_tries )
          and !( $socket = IO::Socket::INET->new(
                  Proto    => "tcp",
                  PeerAddr => "192.168.1.105",
                  PeerPort => "5000",
                  Timeout  => "1" ) ) )
      {
          sleep 1;
          sleep 1 if ( rand() > .5 );    # some variance is good
      }
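
The exponential back-off idea could look something like the following. This is only a sketch: backoff_delay, connect_with_backoff, the MaxTries key, and the 0.25 second base and 30 second cap are all made up for illustration. Only IO::Socket::INET->new and Time::HiRes are real.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Time::HiRes qw(sleep);    # allows fractional-second sleeps

# Hypothetical helper: seconds to wait before retry number $attempt (0-based).
# Doubles from $base up to $cap, then adds up to 50% random jitter.
sub backoff_delay {
    my ($attempt, $base, $cap) = @_;
    my $delay = $base * 2**$attempt;
    $delay = $cap if $delay > $cap;
    return $delay + rand($delay / 2);
}

# Hypothetical helper: a plain loop, no recursion, bounded number of tries.
sub connect_with_backoff {
    my (%opt) = @_;
    my $max_tries = delete $opt{MaxTries};    # our own key, not IO::Socket's
    $max_tries = 8 unless defined $max_tries;

    for my $attempt (0 .. $max_tries - 1) {
        my $socket = IO::Socket::INET->new(%opt);
        return $socket if $socket;
        last if $attempt == $max_tries - 1;   # no sleep after the final try
        sleep( backoff_delay($attempt, 0.25, 30) );
    }
    return;    # caller decides how to report the failure
}

# Usage (address and port taken from the thread):
# my $socket = connect_with_backoff(
#     Proto => "tcp", PeerAddr => "192.168.1.105", PeerPort => "5000",
#     Timeout => 1, MaxTries => 8,
# ) or die "giving up after repeated connect failures\n";
```

The jitter is there for the same reason as the rand() line in the loop above: it keeps a crowd of clients from retrying in lockstep.
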
      As for doing something when the server goes away, a common scheme would be like this:
      $SIG{ALRM} = 'ALRMhandler';
      my $line;
      while ( alarm(30), $line = <$socket>, defined($line) )
      {
          alarm(0);    # turn off alarm
          ... process data in $line ....
      }
      alarm(0);        # the alarm set in the loop condition is still pending here
      ...here if $line is not defined....
      ...another possible "server went away" condition...

      sub ALRMhandler
      {
          ...see discussion...
      }
      The read is blocking. We could get some characters but not enough to reach the \n character, and get "stuck" partway through the read. So we set the alarm to some number of seconds; I just randomly picked 30. When a defined string is returned, the loop condition completes and we move into the body of the loop, which immediately turns off the alarm and processes the data. The next iteration does the same thing.

      If for some reason the read returns an undef value, we fall out of the loop, and that means the server went away.

      What happens in the "fall through the loop" case or in the ALRM handler depends upon the application. Log an error message perhaps. Then maybe restart the app: redo label will go back to label:. I think this is "safe" in Perl. But heck, even just completely restarting by exec'ing the same Perl script again might be just fine too. What's appropriate is up to the OP's goal with his application.
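
One way to fill in the handler, sketched under the assumption that a timeout should surface as an ordinary return value rather than a jump out of a signal handler. It uses the eval/die idiom around alarm. A pipe stands in for the socket, and read_line_with_timeout is an invented name.

```perl
use strict;
use warnings;

# Hypothetical helper: read one line from $fh, giving up after $timeout seconds.
# Returns ('ok', $line), ('eof'), or ('timeout').
sub read_line_with_timeout {
    my ($fh, $timeout) = @_;
    my $line;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm($timeout);
        $line = <$fh>;
        alarm(0);    # got something (or EOF) in time, cancel the alarm
        1;
    };
    unless ($ok) {
        alarm(0);                         # make sure no alarm is left pending
        die $@ unless $@ eq "alarm\n";    # propagate unrelated errors
        return ('timeout');
    }
    return defined $line ? ('ok', $line) : ('eof');
}

# Stand-in for the socket: a local pipe.
pipe(my $reader, my $writer) or die "pipe: $!";

syswrite($writer, "hello\n");
my @first = read_line_with_timeout($reader, 2);     # a full line is waiting

my @second = read_line_with_timeout($reader, 1);    # nothing more was sent

close $writer;                                      # "server" closes its end
my @third = read_line_with_timeout($reader, 1);

print join(", ", $first[0], $second[0], $third[0]), "\n";
```

After a 'timeout' or 'eof' the caller can log, reconnect with the retry loop, or exec the script again, whichever suits the application. The sketch assumes you abandon the handle after a timeout rather than keep reading from it.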