planetjeff has asked for the wisdom of the Perl Monks concerning the following question:

Hi all - first time poster. One of my tcp server modules of a larger program is designed to be able to read from many tcp client connections at the same time. I do this with hash tables and a socket select, so no forking is involved. However, a major anomaly is occurring where, if the client is in the middle of writing data to my server and the client restarts, my server ends up hanging and cannot service any more connections, new or already open. Here is a code snippet:
while (1) {
    # Set up bit vectors for polling
    my $fin = '';
    my $fout;
    vec ($fin, fileno ($sock), 1) = 1;
    foreach my $streamID (keys %openStreamsSock) {
        vec ($fin, fileno($openStreamsSock{$streamID}), 1) = 1;
    }

    # Wait for incoming message
    my $nfound = select ($fout=$fin, undef, undef, 5);
    if ($nfound) {
        if (vec($fout, fileno($sock),1)) {
            $openStreamsSock($streamRequest++) = $sock->accept();
        }
        else {
            foreach my $streamID (keys %openStreamsSock) {
                if (vec($fout, fileno($openStreamsSock{$streamID}),1)) {
                    # read data off the socket; not a message here, just raw data
                    $msgSize = sysread ($openStreamsSock{$streamID}, $msgReceived, 1048576);
                    if (defined ($msgSize) && ($msgSize > 0)) {
                        writeStreamData ($streamID, $msgReceived);
                    }
                    else {
                        # $msgSize being 0 indicates end of stream, or
                        # $msgSize being undef indicates error, so close
                        close ($openStreamsSock{$streamID});
                        delete ($openStreamsSock{$streamID});
                    }
                }
            }
        }
    }
    else {
        print "$0: Normal timeout of select...\n";
    }
}
This code usually works like gangbusters, and I have no trouble processing multiple client streams at the same time. However, if the client is unexpectedly shut down, my code seems to just hang and won't accept any more connections or process data on any currently open connections. Originally, I didn't have the timeout of 5 seconds in the select, but I added it in for debug. And what I noticed is that once the client is killed, even the default timeout stops occurring. It's as if the select stops working altogether. I'm hoping some other eyes would shed light on what could be going wrong here. Any thoughts? Thanks much for your consideration.

Replies are listed 'Best First'.
Re: socket select hangs after client restarts
by BrowserUk (Patriarch) on Feb 18, 2010 at 00:49 UTC

    I have no conclusions or solutions to offer, but an observation prompted by my own limited experience of using select. It might prompt others to authoritative answers.

    You are not handling error conditions. The 4-arg select is defined as select RBITS,WBITS,EBITS,TIMEOUT where EBITS are those streams in your select group that have experienced error conditions.

    My touchy-feely findings are that unless you handle those--using shutdown & close--you will be subject to the default timeout, which is often set to 900 seconds, i.e. 15 minutes. If, when your code freezes on you, you wait 15 minutes or so, you may find that it starts working again. Not a tenable situation, but it might help point to a fix. Or not!


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

        Yes, I know, but I do not think it is the correct way to deal with the OPs problem. Better, I believe, to handle EBITS and shutdown & close streams with errors.

        But, as I've only done this stuff on Windows, and from the mention of fork I'm assuming the OP is on *nix, and WinSocks have a habit of behaving subtly (and not so subtly) differently, I'm not prepared to put any weight behind that assertion.


Re: socket select hangs after client restarts
by zentara (Cardinal) on Feb 18, 2010 at 12:05 UTC
    This might be a Windows-only problem, but when I was testing sockets, I found I needed to check $sock->connected during the select read loop. Read perldoc IO::Socket and look for the "connected" method. If you check that before trying to read, you can let that client go, and find the new one that logged in. See ztk-enchat encrypted server client for tricks like checking if the socket is connected, or sometimes, checking if it's still defined.

    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku
Re: socket select hangs after client restarts
by 7stud (Deacon) on Feb 18, 2010 at 21:39 UTC
    Originally, I didn't have the timeout of 5 seconds in the select, but I added it in for debug. And what I noticed is that once the client is killed, even the default timeout stops occurring. It's as if the select stops working altogether.

    Why do you assume your program is hanging on the select()?

    This code usually works like gangbusters

    I don't see how that's possible:

    1) What does the second line here do:

    for (keys %openStreamsSock) {
        ...
        ...
        $openStreamsSock($streamRequest++) = $sock->accept();

    2) This causes an error for me:

    $openStreamsSock($streamRequest++) = $sock->accept();

    ..even when I use IO::Socket.

    3) You never add new connections to the $fin string:

    if (vec($fout, fileno($sock),1)) {
        $openStreamsSock($streamRequest++) = $sock->accept();
    }

    As a result, when you ask select() to check all the filehandles in your $fin string:

    my $nfound = select ($fout=$fin, undef, undef, 5);

    the select never checks to see if any of the new connections are readable. Therefore, this if-statement:

    if (vec($fout, fileno($openStreamsSock{$streamID}),1))

    fails for any new connection.

    4) This conditional:

    if (defined ($msgSize) && ($msgSize > 0))

    is equivalent to:

    if ($msgSize > 0)

    or simply:

    if ($msgSize)

    Although, I think the '> 0' conditional is more specific and therefore clearer.
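    A small sketch of how the three possible sysread return values behave in these tests may help here. The helper names classify_read and classify_naive are made up for illustration. One thing worth knowing when chaining tests on a sysread result: undef compares numerically as 0 (with an "uninitialized" warning under use warnings), so a chain that tests "== 0" without a defined() check can never reach its error branch.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a sysread() return value.
sub classify_read {
    my ($n) = @_;
    return 'error' unless defined $n;   # undef: I/O error, reason is in $!
    return 'eof'   if $n == 0;          # 0: the peer closed the connection
    return 'data';                      # positive: number of bytes read
}

# Hypothetical counter-example: the same chain without a defined() check.
sub classify_naive {
    my ($n) = @_;
    no warnings 'uninitialized';        # silence the warning undef would trigger
    return 'data' if $n;
    return 'eof'  if $n == 0;           # undef == 0 is true, so undef lands here
    return 'error';                     # never reached
}

for my $n (42, 0, undef) {
    my $shown = defined $n ? $n : 'undef';
    printf "%-5s careful=%-5s naive=%s\n",
        $shown, classify_read($n), classify_naive($n);
}
```

    So "$msgSize > 0" does send both 0 and undef to the else branch, but the defined() check is what keeps the warning away and lets you tell an error from an end of stream.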

    5) You never delete closed connections from the $fin string:

    else {
        # $msgSize being 0 indicates end of stream, or
        # $msgSize being undef indicates error, so close
        close ($openStreamsSock{$streamID});
        delete ($openStreamsSock{$streamID});
    }

    Therefore, your select() will continue to check whether all your filehandles are readable--even filehandles for disconnected sockets. I wouldn't think that would be a problem; I would think that select() would say that a filehandle on a closed socket wouldn't be readable. You would be just unnecessarily checking the filehandle. However, that is in fact your problem. I replaced your $sock->accept() line with this:

    say "in 2nd if";
    #$openStreamsSock{$streamRequest++} = $LISTEN_SOCK->accept();
    my $packed_remote_addr = accept(my $CONNECTION, $LISTEN_SOCK)
        or warn "Couldn't connect: $!";
    say "before 3rd if";
    if ($packed_remote_addr) #then created a connection
    {
        vec($fin, fileno $CONNECTION, 1) = 1;
        $openStreamsSock{$streamRequest++} = $CONNECTION;
    }

    And after a client disconnects, somehow the code blocks on the accept(). Inexplicably, after a client disconnects the select() says that the $LISTEN_SOCK is readable--in other words, select() says a client is trying to connect, which doesn't make any sense. Unfortunately, I have no idea why that is the case. Fortunately, there is a solution: remove the closed socket from $fin,

    else {
        # $msgSize being 0 indicates end of stream, or
        # $msgSize being undef indicates error, so close
        vec($fin, fileno $openStreamsSock{$streamID}, 1) = 0;
        close ($openStreamsSock{$streamID});
        delete ($openStreamsSock{$streamID});

    Make sure you remove the socket from $fin *before* closing the socket.
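    The reason the order matters can be sketched with a plain pipe standing in for a client socket (an assumption made only to keep the example self-contained): once a handle is closed, fileno returns undef, so there is no descriptor number left to clear from the bit string.

```perl
use strict;
use warnings;

# A pipe stands in for a client connection here.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $rin = '';
my $fd  = fileno($reader);

vec($rin, $fd, 1) = 1;                  # start watching the descriptor
my $was_set = vec($rin, $fd, 1);        # 1

vec($rin, $fd, 1) = 0;                  # stop watching it BEFORE closing
close $reader;

my $fd_after_close = fileno($reader);   # undef: the handle is no longer open
my $bits_left      = unpack('b*', $rin);

print "bit was set: $was_set\n";
print "fileno after close is ",
    (defined $fd_after_close ? $fd_after_close : 'undef'), "\n";
print "remaining bits: $bits_left\n";
```

    Clearing after the close would pass undef as the offset to vec, which Perl treats as 0, so the wrong bit would be cleared and the stale one would stay set.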

    The following is a working example that avoids all the problems you experienced--by properly adding and removing filehandles from $fin. Using this select():

    select($rout=$rin, undef, undef, undef)

    worked fine for me.

    #CLIENT (run in multiple terminals to simulate multiple clients)
    use strict;
    use warnings;
    use 5.010;
    use Socket;

    my $protocol = getprotobyname 'tcp';
    socket my $SOCK, AF_INET, SOCK_STREAM, $protocol
        or die "Couldn't create socket: $!";

    my $port = 12555;
    my $host = 'localhost';
    my $packed_host = gethostbyname $host
        or die "Unknown host: $!";
    my $sock_addr = sockaddr_in $port, $packed_host;

    connect $SOCK, $sock_addr
        or die "couldn't connect: $!";

    # Turn on autoflush for the socket
    my $old_out = select $SOCK;
    $| = 1;
    select $old_out;

    print "Enter some text: ";
    while (my $to_send = <STDIN>) { #ctrl+C to kill client, ctrl+D to send eof
        print $SOCK $to_send;
    }
    close $SOCK;

    #SERVER:
    use strict;
    use warnings;
    use 5.010;
    use Socket;

    my $protocol = getprotobyname 'tcp';
    socket my $LISTEN_SOCK, AF_INET, SOCK_STREAM, $protocol
        or die "Can't make socket: $!";
    setsockopt $LISTEN_SOCK, SOL_SOCKET, SO_REUSEADDR, 1
        or die "Can't set SO_REUSEADDR: $!";

    my $port = 12555;
    my $listen_addr = sockaddr_in $port, INADDR_ANY;
    bind $LISTEN_SOCK, $listen_addr
        or die "bind failed: $!";
    listen $LISTEN_SOCK, 5;

    warn "processing sockets...\n";

    my $rin = '';
    my $rout;
    vec($rin, fileno($LISTEN_SOCK), 1) = 1;

    my %open_sockets;
    my ($remote_host, $remote_port);

    while (1) {
        my $nfound = select ($rout=$rin, undef, undef, undef);
        if ($nfound) {
            if (vec $rout, fileno $LISTEN_SOCK, 1) {
                my $packed_remote_addr = accept my ($CONNECTION), $LISTEN_SOCK;
                if ($packed_remote_addr) {
                    vec($rin, fileno $CONNECTION, 1) = 1;
                    ($remote_port, my $packed_remote_host) =
                        unpack_sockaddr_in($packed_remote_addr);
                    $remote_host = inet_ntoa $packed_remote_host;
                    warn "adding new connection to hash...\n";
                    $open_sockets{fileno $CONNECTION} =
                        [$CONNECTION, "$remote_host : $remote_port"];
                }
                else {
                    warn "couldn't connect";
                }
            }

            for my $filenum (keys %open_sockets) {
                my ($CONN, $remote_info) = @{$open_sockets{$filenum}};
                if (vec $rout, $filenum, 1) {
                    my $available_data;
                    my $result = sysread $CONN, $available_data, 8096;
                    if ($result) { #data was read from socket
                        print "[$remote_info] says: $available_data";
                    }
                    elsif (defined $result) { #0 => eof => socket closed
                        # (tested with defined() because undef == 0 is also true)
                        delete $open_sockets{$filenum};
                        vec($rin, $filenum, 1) = 0;
                        close $CONN;
                        say "[$remote_info]: deleted from hash. Goodbye!";
                    }
                    else { #undef=>IO error on socket
                        warn "[$remote_info]: experienced an IO error: $!";
                    }
                }
            }
        }
    }

      Many thanks for your response - let me take this by the numbers:

      1,2,3) It was actually a typo, caused by me editing out superfluous code/comments/debugs, and a manual typing error. That line should have been:

      $openStreamsSock{$streamRequest++} = $sock->accept();

      $streamRequest is just a stream counter. New client connection sockets are added to the hash, with $streamRequest as the unique key.

      4) Thanks for the coding economy!

      5) You're right, I wasn't depopulating $fin (now $rin, to match standard), but at the top of the loop I reinitialized $fin and then set it up again by adding the listening socket and all currently open sockets. That should have taken care of initializing it correctly each time I get to the select.

      Here's the updated code, adding in checking EBITS:

      while (1) {
          # Set up bit vectors for polling
          my $rin = '';
          my $ein = '';
          my $rout;
          my $eout;
          vec ($rin, fileno ($sock), 1) = 1;
          vec ($ein, fileno ($sock), 1) = 1;
          foreach my $streamID (keys %openStreamsSock) {
              vec ($rin, fileno($openStreamsSock{$streamID}), 1) = 1;
              vec ($ein, fileno($openStreamsSock{$streamID}), 1) = 1;
          }

          # Wait for incoming message
          my $nfound = select ($rout=$rin, undef, $eout=$ein, 5);
          if ($nfound) {
              # Check client streams, close any in error
              foreach my $streamID (keys %openStreamsSock) {
                  if (vec($eout, fileno($openStreamsSock{$streamID}),1)) {
                      close ($openStreamsSock{$streamID});
                      delete ($openStreamsSock{$streamID});
                  }
              }

              if (vec($rout, fileno($sock),1)) {
                  $openStreamsSock{$streamRequest++} = $sock->accept();
              }
              else {
                  foreach my $streamID (keys %openStreamsSock) {
                      if (vec($rout, fileno($openStreamsSock{$streamID}),1)) {
                          # read data off the socket; not a message here, just raw data
                          $msgSize = sysread ($openStreamsSock{$streamID}, $msgReceived, 1048576);
                          if (defined ($msgSize) && ($msgSize > 0)) {
                              writeStreamData ($streamID, $msgReceived);
                          }
                          else {
                              # $msgSize being 0 indicates end of stream, or
                              # $msgSize being undef indicates error, so close
                              close ($openStreamsSock{$streamID});
                              delete ($openStreamsSock{$streamID});
                          }
                      }
                  }
              }
          }
          else {
              print "$0: Normal timeout of select...\n";
          }
      }

      I'll have to look at your example a bit - it looks like you're using some different constructs. Thanks again!

Re: socket select hangs after client restarts
by 7stud (Deacon) on Feb 19, 2010 at 00:53 UTC
    # Check client streams, close any in error

    I don't think that is what $eout indicates. I know BrowserUk said:

    The 4-arg select is defined as select RBITS,WBITS,EBITS,TIMEOUT where EBITS are those streams in your select group that have experienced error conditions.

    but that is not my understanding of $eout. As far as I can tell, $eout indicates file handles with urgent data--not filehandles that have experienced an IO error. In other words, 'exceptional conditions' does not mean 'exceptions'.

    5) You're right, I wasn't depopulating $fin (now $rin, to match standard), but at the top of the loop I reinitialized $fin and then set it up again by adding the listening socket and all currently open sockets. That should have taken care of initializing it correctly each time I get to the select.

    You're right, I missed that. I chopped out that part of your code to fit it into my example. I'll have to add that back in and see if I can track down what's going on.

    I also worked up an example using IO::Socket and IO::Select, which allows you to dispense with all the bit twiddling (which I loathe). The server code is greatly simplified:

    #SERVER:
    use strict;
    use warnings;
    use 5.010;
    use IO::Socket;
    use IO::Select;

    my $LISTEN_SOCK = IO::Socket::INET->new(
        LocalPort => 12555,
        Listen    => 5,
        Reuse     => 1,
    ) or die "Couldn't create socket: $@";

    my $sock_group = IO::Select->new()
        or die "Couldn't create select";
    $sock_group->add($LISTEN_SOCK);

    warn "listening for connections...\n";

    while (1) {
        my @ready_to_read = $sock_group->can_read;
        for my $READABLE_SOCK (@ready_to_read) {
            if ($READABLE_SOCK eq $LISTEN_SOCK) {
                my $NEW_CONNECTION = $LISTEN_SOCK->accept()
                    or warn "Couldn't connect: $!";
                if ($NEW_CONNECTION) {
                    $sock_group->add($NEW_CONNECTION);
                }
            }
            else {
                my $status = $READABLE_SOCK->sysread( my($available_data), 8096 );
                if ($status) { #read some data
                    STDOUT->syswrite($available_data);
                }
                elsif (defined $status) { #0 => read eof => client closed socket
                    # (tested with defined() because undef == 0 is also true)
                    $sock_group->remove($READABLE_SOCK);
                    close $READABLE_SOCK;
                }
                else { #undef=>IO error
                    warn "IO error while reading socket: $READABLE_SOCK";
                }
            }
        }
    }
    #CLIENT: (same as previous)
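    The read handling in the IO::Select server above can be tried without a network. This is only a sketch: it assumes a Unix-like system where AF_UNIX socketpairs exist, the helper name service_once is made up, and unlike the server above it also drops a socket on an I/O error.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Socket;
use IO::Select;

# A connected pair of sockets stands in for the server and client ends.
my ($server_end, $client_end) =
    IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

my $sock_group = IO::Select->new($server_end);

# Hypothetical helper: service every readable socket once, report what happened.
sub service_once {
    my ($group, $timeout) = @_;
    my @events;
    for my $sock ($group->can_read($timeout)) {
        my $status = $sock->sysread(my $data, 8192);
        if (!defined $status) {          # undef: I/O error
            push @events, "error: $!";
            $group->remove($sock);
            close $sock;
        }
        elsif ($status == 0) {           # 0: the client closed its end
            push @events, 'eof';
            $group->remove($sock);
            close $sock;
        }
        else {                           # positive: bytes were read
            push @events, "data: $data";
        }
    }
    return @events;
}

my @idle = service_once($sock_group, 0.1);   # silent client: nothing readable
$client_end->syswrite("hello\n");
my @got  = service_once($sock_group, 0.1);   # one data event
close $client_end;
my @done = service_once($sock_group, 0.1);   # eof, socket leaves the group

print scalar(@idle), " events while idle\n";
print "then: @got";
print "finally: @done\n";
```

    Removing the socket from the group before closing it mirrors the "remove from the bit string first" advice for the hand-rolled version.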
Re: socket select hangs after client restarts
by 7stud (Deacon) on Feb 19, 2010 at 09:30 UTC

    When I add your for loop to the top of my example, I can't duplicate the hanging you are experiencing. Can you try running the following server as well as two clients in separate windows? At the prompt in the clients, type in some messages and hit return, then observe how the server displays the messages in its terminal window. Then hit ctrl+C in one of the clients. Then enter a message at the prompt in the other client. See if the server displays the message in its terminal window.

    Note: I have included some debugging output too, which you will see in the server window.

    Note2: recreating the bit string every time through the loop is not very efficient. You should just add new connections to the bit string when new connections are detected, and remove connections from the bit string as connections are deleted (and better yet dispense with all the bit twiddling and use IO::Select).

    #SERVER:
    use strict;
    use warnings;
    use 5.010;
    use Socket;

    my $protocol = getprotobyname 'tcp';
    socket my $LISTEN_SOCK, AF_INET, SOCK_STREAM, $protocol
        or die "Can't make socket: $!";
    setsockopt $LISTEN_SOCK, SOL_SOCKET, SO_REUSEADDR, 1
        or die "Can't set SO_REUSEADDR: $!";

    my $port = 12555;
    my $listen_addr = sockaddr_in $port, INADDR_ANY;
    bind $LISTEN_SOCK, $listen_addr
        or die "bind failed: $!";
    listen $LISTEN_SOCK, 5;

    warn "processing sockets...\n";

    my %openStreamsSock;
    my $streamRequest;
    my $count = 0;

    while (1) {
        # Set up bit vectors for polling
        my $fin = '';
        my $fout;
        vec ($fin, fileno ($LISTEN_SOCK), 1) = 1;
        foreach my $streamID (keys %openStreamsSock) {
            vec ($fin, fileno($openStreamsSock{$streamID}), 1) = 1;
        }

        # Wait for incoming message
        my $nfound = select ($fout=$fin, undef, undef, undef);
        say "select worked after child ended" if $count == 1;

        if ($nfound) {
            if (vec($fout, fileno($LISTEN_SOCK),1)) {
                say "in 2nd if";
                #$openStreamsSock{$streamRequest++} = $LISTEN_SOCK->accept();
                my $packed_remote_addr = accept(my $CONNECTION, $LISTEN_SOCK)
                    or warn "Couldn't connect: $!";
                say "before 3rd if";
                if ($packed_remote_addr) {
                    say 'in 3rd if';
                    $openStreamsSock{$streamRequest++} = $CONNECTION;
                }
            }

            say 'starting for loop';
            foreach my $streamID (keys %openStreamsSock) {
                say "in for loop after child ended" if $count == 1;
                if (vec($fout, fileno($openStreamsSock{$streamID}),1)) {
                    # read data off the socket; not a message here, just raw data
                    my $msgSize = sysread ($openStreamsSock{$streamID}, my $msgReceived, 1048576);
                    if ($msgSize > 0) {
                        #writeStreamData ($streamID, $msgReceived);
                        syswrite(STDOUT, $msgReceived);
                    }
                    else {
                        # $msgSize being 0 indicates end of stream, or
                        # $msgSize being undef indicates error, so close
                        #vec($fin, fileno $openStreamsSock{$streamID}, 1) = 0;
                        close ($openStreamsSock{$streamID});
                        delete ($openStreamsSock{$streamID});
                        say 'did deleting';
                        $count = 1;
                    }
                }
            }
        }
        else {
            print "$0: Normal timeout of select...\n";
        }
    }

    #CLIENT:
    use strict;
    use warnings;
    use 5.010;
    use Socket;

    my $protocol = getprotobyname 'tcp';
    socket my $SOCK, AF_INET, SOCK_STREAM, $protocol
        or die "Couldn't create socket: $!";

    my $port = 12555;
    my $host = 'localhost';
    my $packed_host = gethostbyname $host
        or die "Unknown host: $!";
    my $sock_addr = sockaddr_in $port, $packed_host;

    connect $SOCK, $sock_addr
        or die "couldn't connect: $!";

    # Turn on autoflush for the socket
    my $old_out = select $SOCK;
    $| = 1;
    select $old_out;

    print "Enter some text: ";
    while (my $to_send = <STDIN>) {
        print $SOCK $to_send;
    }
    close $SOCK;

      Okay - problem has been solved, and I also have a slight confession.

      I actually edited down my program a bit more than I should have when I posted it, and it didn't reflect its true functionality. When a new socket was opened, I was immediately reading data with a sysread. (The first block of data after an open socket is supposed to contain the filename for the server to write the following data to, so I always assumed the client would be a 'good' client and have that data ready to be read on open!) The code was hanging on that particular sysread. Sometimes the client requests an open socket and does not send any data. Why the code would just hang on sysread baffles me, but that is exactly what it was doing. I did have the code to process successfully read data, zero data, and an undefined return from sysread, but in this one special case it would never break out of the sysread function. Why is that?

      My pared down code - and yours, too - gave me the idea to just accept the socket request and put it on the select queue to wait for data to be read. Problem solved. I now wait for data before reading it, and it now handles good and bad clients.

      Many thanks for the help, monks. I hope this in turn can help others who experience the same situation. Don't assume clients will behave!

      p.s. I would be interested if anyone has insight why sysread hangs in this sort of scenario...thx!
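      On the sysread question: a socket is in blocking mode unless you change it, so sysread on a connection that has nothing to deliver simply waits until data arrives or the peer closes. It has nothing to return yet, which is why none of the data / 0 / undef branches ever ran. The sketch below shows the "wait with select, then read" approach described above. It assumes a Unix-like system (an AF_UNIX socketpair stands in for the TCP connection), and the helper name readable_within and the filename upload.dat are made up for illustration.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# A connected socket pair stands in for the accepted connection and its client.
socketpair(my $server_end, my $client_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Hypothetical helper: true if $fh has data (or EOF) waiting within $timeout seconds.
sub readable_within {
    my ($fh, $timeout) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    my $rout   = $rin;
    my $nfound = select($rout, undef, undef, $timeout);
    return $nfound > 0 ? 1 : 0;
}

# 1) Client has connected but sent nothing: select times out, so we do not
#    call sysread (which would block here).
my $idle = readable_within($server_end, 0.1);

# 2) Client sends the filename line: select reports readable, sysread returns at once.
syswrite($client_end, "upload.dat\n");
my $ready = readable_within($server_end, 0.1);
my $n     = sysread($server_end, my $buf, 1024);

# 3) Client goes away: the socket becomes readable again and sysread returns 0.
close $client_end;
my $eof_ready = readable_within($server_end, 0.1);
my $eof       = sysread($server_end, my $rest, 1024);

print "idle=$idle ready=$ready bytes=$n eof_ready=$eof_ready eof=$eof\n";
```

      Putting a newly accepted socket into the select set and reading only when select reports it readable is what keeps one silent client from stalling everyone else.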