Knowing an IO::Socket handle has reached end-of-file

dash2 has asked for the wisdom of the Perl Monks concerning the following question:

I'm still struggling with the idea of sockets, so here's a question with a practical side and a theoretical side.

The practical side: I have a load of sockets from which I am reading. They are IO::Socket objects attached to an IO::Select. I want to loop through them using `while (@handles = IO::Select->select($sel, undef, $self, 1))` and read a line at a time. The idea is that this is faster than reading each one in turn: I am trying to parallelize my socket reading.

So far, so good... but I have a problem. Normally, with just one socket, I can read the socket "to the end" by using <> in array context. With my new solution, I don't want to read the socket to the end - I just want one line at a time. But I do want to find out when data has finished coming over a socket. Otherwise, I end up going round and round the select() call.

That's the practical question: how do I find out when data has finished coming over a socket, the same way perl does when I use the angle operator in array context, so as to remove the socket from my IO::Select?

The theoretical question is: what's the underlying mechanism? I know an HTML page is finished cos it says </HTML>. But how does one know a socket is finished? Does it communicate this fact at all? Or are you just supposed to guess because it has stopped talking to you?

Awaiting enlightenment.

dave hj~

Re: Knowing an IO::Socket handle has by Everlasting God (Beadle) on Jul 10, 2001 at 18:59 UTC
Depends on the socket type. TCP sockets are 'connection oriented', meaning that there is a definite beginning and end to the communication channel. These states are signaled by various control packets, including a request to open a connection (ok, so it's not quite that simple, with the SYN handshaking and whatnot, but that's not important) and one to close it, definitively signaling the end of data until the connection is reopened. I don't have a clue what the server you are connecting to is doing, but there is the possibility that it doesn't close the connection when it is through with a block of data, presumably in anticipation of more data to be sent, but I think it would send an eof or eot char. Don't quote me on that last bit though, can't remember off the top of my head. Time to go grab the crab book...

UDP, on the other hand, is message oriented, so a listening socket is always ready to receive data. If said data happens to include a listening UDP port address on the sending machine (a pretty common practice), then a two-way data flow can be established, but one can never know whether this is the last packet, short of knowing that the remote machine is no longer up or reachable.

Sounds like you are using TCP connections, so there is a definite and signaled end to the connection.

'The fickle fascination of an Everlasting God' - Billy Corgan, The Smashing Pumpkins
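To make the "definite and signaled end" concrete, here is a minimal sketch, assuming a loopback listener on an OS-chosen port (the address, the port choice and the variable names are illustrative, not from the thread). The client writes one line and closes; the accepting side then sees the data, followed by a read of zero bytes.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listener on the loopback interface; LocalPort => 0 asks the OS for a free port.
my $listener = IO::Socket::INET->new(
    Listen    => 1,
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
) or die "listen failed: $!";

# Connect to whatever port the OS picked.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listener->sockport,
    Proto    => 'tcp',
) or die "connect failed: $!";

my $server_side = $listener->accept or die "accept failed: $!";

# The sender writes some data and then closes its end of the connection.
$client->syswrite("hello\n");
$client->close;

# The first read returns the data; the next one returns 0 because the
# peer has closed and nothing more will ever arrive.
my $first  = $server_side->sysread(my $buf,  512);
my $second = $server_side->sysread(my $rest, 512);

print "first read: $first bytes, second read: $second bytes\n";
```

No end-of-file character travels in the data itself; the zero-length read is how the close shows up to the reader.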
Re: Knowing an IO::Socket handle has by dash2 (Hermit) on Jul 10, 2001 at 16:07 UTC
Hmm. Funny how rereading man pages produces information that you could have sworn didn't exist when you read them a few weeks ago! I guess the practical answer is `eof`, as a function or a method for any `IO::Handle` object. The theoretical question still interests me...

dave hj~
Re: Re: Knowing an IO::Socket handle has by Anonymous Monk on Jul 10, 2001 at 20:29 UTC
No, eof doesn't work. On a socket you are always at eof if no data is pending. But if you read with <> into a single variable rather than into an array, perl has no choice but to return a single line (including the end-of-line character); at eof it returns undef instead.
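A small sketch of the scalar-context read, assuming an in-process socket pair as a stand-in for a real connection (the pair and the variable names are illustrative only). Each `<$reader>` returns one line, and once the writer has closed, readline returns undef.

```perl
use strict;
use warnings;
use IO::Socket;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# A connected pair of stream sockets, used here instead of a remote peer.
my ($reader, $writer) = IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

print {$writer} "first line\nsecond line\n";
$writer->close;    # flushes the data and signals end of stream

# In scalar context <> hands back one line per call, newline included.
my @lines;
while (defined(my $line = <$reader>)) {
    push @lines, $line;
}

# Reading again after end of stream still yields undef.
my $after_eof = <$reader>;

print "got ", scalar(@lines), " lines\n";
```

Note that `<>` is buffered and blocks until a whole line (or end of stream) arrives, so combining it with select() needs care; perl's own select documentation warns against mixing buffered reads with select.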
Re: Knowing an IO::Socket handle has by kschwab (Vicar) on Jul 10, 2001 at 20:30 UTC
You seem to be talking about TCP stream-oriented connections. In that case, from a Berkeley sockets perspective, the EOF is usually detected by a successful read of zero bytes from read(), recv(), or sysread(). Here's a short example of a TCP server that handles multiple connections and detects when they are closed. No warranties, I skipped some error checking to keep it concise. Hope it helps.

    #!/usr/bin/perl -w
    use strict;
    use IO::Select;
    use IO::Socket;

    my $s = IO::Select->new();
    my $listener = IO::Socket::INET->new(
        Listen    => 5,
        LocalPort => 9999,
        Proto     => 'tcp',
    );
    $s->add($listener);
    my $buf;

    while (1) {
        # loop thru readable sockets
        for ($s->can_read()) {
            # readable event on a listener means we have a new
            # socket connection
            if ($_ == $listener) {
                # accept the connection and add to the select
                # object
                print "Connection...groovy\n";
                my $newsock = $listener->accept;
                $s->add($newsock);
            } else {
                # this is a "regular" socket, not the listener
                # try and read up to 512 bytes from it
                my $return = $_->sysread($buf, 512);
                if (!defined($return)) {
                    # undef from sysread means an error
                    print "error: $!\n";
                    $s->remove($_);
                    $_->close;
                } elsif ($return == 0) {
                    # 0 from sysread means the connection was closed
                    print "socket was closed\n";
                    $s->remove($_);
                    $_->close;
                } else {
                    # a positive int from sysread means
                    # we got some data
                    # change non-printables to a . char
                    $buf =~ tr/\0-\37\177-\377/./;
                    print "read $return bytes from a socket [$buf]\n";
                }
            }
        }
    }
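For the original goal of handling one line at a time across several sockets, the same zero-byte check can be combined with a per-handle buffer. The following is only a sketch under assumptions: in-process socket pairs stand in for the remote peers, and the names `%name`, `%pending` and `%got` are invented for illustration.

```perl
use strict;
use warnings;
use IO::Select;
use IO::Socket;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

my $sel = IO::Select->new;
my (%name, %pending, %got);    # all keyed by the stringified handle or label

# Set up two stand-in peers that each send some lines and then close.
my %script = (A => "a1\na2\n", B => "b1\n");
for my $label (sort keys %script) {
    my ($reader, $writer) = IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";
    $writer->syswrite($script{$label});
    $writer->close;
    $sel->add($reader);
    $name{$reader}    = $label;
    $pending{$reader} = '';
    $got{$label}      = [];
}

# Keep going until every handle has been removed from the select set.
while ($sel->count) {
    for my $fh ($sel->can_read(1)) {
        my $n = $fh->sysread(my $chunk, 512);
        if (!defined $n) {
            # read error: give up on this handle
            warn "error on $name{$fh}: $!\n";
            $sel->remove($fh);
            $fh->close;
        } elsif ($n == 0) {
            # zero bytes: the peer closed, so stop selecting on it
            # (any unterminated text left in $pending{$fh} is dropped here)
            $sel->remove($fh);
            $fh->close;
        } else {
            # append to this handle's buffer and peel off complete lines
            $pending{$fh} .= $chunk;
            while ($pending{$fh} =~ s/^(.*?\n)//) {
                push @{ $got{ $name{$fh} } }, $1;
            }
        }
    }
}

print "$_: ", scalar(@{ $got{$_} }), " line(s)\n" for sort keys %got;
```

Using sysread with a manual buffer avoids mixing buffered `<>` reads with select(), at the cost of splitting lines by hand.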