Zidane has asked for the wisdom of the Perl Monks concerning the following question:

dear perl monks.

i am somewhat of a perl beginner, and over the last few months have grown to love the language. As an attempt to learn to use various modules, i have written the following perl script. It is the beginnings of a very basic irc client.

the whole script and the configuration file

    ##########
    #this is the bit that bugs out, afaik
    ##########
    #enter select group read wait loop
    my @ready_sockets;
    while (@ready_sockets = $select_group->can_read) {
        #for each socket.....
        foreach my $sock (@ready_sockets) {
            #pull data from the socket
            my $data = <$sock>;
            #if the data is null, socket is closed
            if (!($data)) {
                print "socket closed\n" if $debug;
                #pull it out of the select group
                $select_group->remove($sock);
                $sock->close;
                return;
            }
            print "socket replied: $data" if $debug;
            if (substr($data,0,4) eq "PING") {
                print "pong" . substr($data,4);
                print $sock "PONG" . substr($data,4);
            }
        }
    }

However, the script seems to have a problem that i cannot solve. when connecting to the irc server, the script correctly receives the first few lines, and then seems to ignore the data on the socket for a random amount of time. Oddly, it always pauses at the same point on each server, but it is a different point on each server. this leads the connection to time out.

it would appear that after reading the first few lines, the $select_group->can_read is not recognising data ready for reading on the socket. I do not know why.

so far i have tried the following to solve this problem:-

could anyone shed any light on this? i have been working on this all day and can see no reason at all why it should behave in this way. any help would be appreciated.

thank you in advance.

Replies are listed 'Best First'.
Re: select appears to ignore pending data on socket.
by Joost (Canon) on Nov 03, 2007 at 19:11 UTC
    I suspect your problem is related to this:
        #pull data from the socket
        my $data = <$sock>;
    If the currently available data on $sock does not contain a newline (or whatever you've set the record separator to), that call will block until it does.

    The same sort of thing may happen if you print() to a socket and the other side is slow in receiving.

    You should probably make sure that the sockets are non-blocking (see IO::Socket::INET) and use the recv() and send() calls to receive and send data, since they will work right on non-blocking sockets.

      that's what is so troubling, the data does contain several lines of text, each terminating with a newline. i have captured the raw traffic using wireshark (which in turn uses tcpdump), and verified the data is coming in correctly. the data is there, ready and waiting, it just appears that select is not triggering for it.

      i have also tried using blocking and non-blocking sockets with no success, i receive the same result regardless of the blocking state.

      i had looked at recv, but it appears to receive a string of data of a set length. unfortunately i do not know the length of each line in advance, and i am cautious about reading in too much data and reading past the newline into the start of the next line.

      i have just tried using recv to read the data in from the socket and got exactly the same results. i am sure it is a problem with the select call somewhere. it is just not triggering from the data pending on the socket.

      i fail to understand why select would simply ignore pending data.

Re: select appears to ignore pending data on socket.
by Somni (Friar) on Nov 04, 2007 at 01:55 UTC
    You are mixing buffered operations (print, <$sock>, etc.) with select. This is a no-no. While it can sometimes work, it's probably not going to work well here.

    You need to use sysread and syswrite (or send and recv, though the extra arguments are useless to you at this point). Autoflush is also not going to do any good, given it's for flushing buffers that you shouldn't be using.

    If you were having a problem with sysread/syswrite or send/recv then you should post the code that uses them, and ask questions about it.

      my problem is, i am going to need to wait on multiple sockets, and receive variable length input. as far as i am aware, recv and sysread only accept a set length of input to pull off the buffer, is this not correct?

      the socket may (or may not) have several lines waiting to be read too, just for that extra added complication.

      whilst i could write a parser to parse a single recv read into multiple lines, i cant block on the socket, and as far as i am aware, using recv to read more data than is available on the buffer will cause it to block.

      is there something else that can be used instead of select? i started out using the event module, but slowly drifted over to select as it seemed much simpler.

        Please stop using recv. It works fine, but it has arguments you are almost certainly not using. Use sysread instead.

        sysread will return on short reads. Meaning you do have to pay attention to the return value; you are not guaranteed to get all that you asked for. As long as select says there's something to read, sysread will not block.

        You are also, therefore, not guaranteed to get a full line on each sysread, so you will have to do your own buffering. Event.pm and POE are designed to abstract away most of these details, as it's tedious and error-prone to implement them yourself.
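
A minimal sketch of the approach described above: select decides when to read, sysread does the reading, and the script keeps its own buffer of partial lines. Everything here is an assumption made for illustration. The socketpair stands in for the IRC connection, the canned server text is invented, and the 8-byte read size is deliberately tiny so that short reads and split lines actually occur (a real client would more likely ask for a few kilobytes). It assumes a Unix-like system.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# Assumption: a local socket pair stands in for the real IRC connection.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# The fake server sends three lines and then stops writing.
syswrite($server, "NOTICE AUTH :hi\r\nPING :12345\r\nNOTICE AUTH :bye\r\n")
    or die "syswrite failed: $!";
shutdown($server, 1) or die "shutdown failed: $!";   # 1 = no more writes

my $select = IO::Select->new($client);
my %buffer;   # partial-line data, keyed by file descriptor number
my @lines;    # every complete line seen so far

while ($select->count) {
    for my $sock ($select->can_read) {
        # Unbuffered read; may return fewer bytes than requested.
        my $n = sysread($sock, my $chunk, 8);
        die "sysread failed: $!" unless defined $n;

        if ($n == 0) {                      # end of file: peer closed
            delete $buffer{ fileno $sock };
            $select->remove($sock);
            close $sock;
            next;
        }

        $buffer{ fileno $sock } .= $chunk;

        # Peel off complete lines; anything left over waits for more data.
        while ($buffer{ fileno $sock } =~ s/^(.*?)\r?\n//) {
            my $line = $1;
            push @lines, $line;
            if ($line =~ /^PING(.*)/) {
                syswrite($sock, "PONG$1\r\n") or die "syswrite failed: $!";
            }
        }
    }
}

# What the fake server received back from the client.
sysread($server, my $reply, 4096);
print "client saw: $_\n" for @lines;
print "server got: $reply";
```

Note that neither <$sock> nor print is used on the socket anywhere, so nothing is held in a Perl-level buffer where select cannot see it.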

Re: select appears to ignore pending data on socket.
by Illuminatus (Curate) on Nov 05, 2007 at 13:08 UTC
    As has already been stated, use of the <> operator is a buffered operation. In the provided loop, you invoke the operator once. If there are 3 newlines within the received data, your program would have received up to the first, but the remaining data could well have been read off the socket by the buffering logic. The select would hang, because the socket is empty, but there is data in the buffer that you have not read.
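
The effect described above can be reproduced without an IRC server. The following is only a sketch, assuming a Unix-like system; the socketpair and the three-line payload are stand-ins invented for the demonstration. One buffered <$reader> call returns the first line, yet the whole payload has already been pulled off the socket, so a second poll with select finds nothing to read.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# Assumption: a local socket pair plays the role of the server connection.
socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Three complete lines arrive in a single chunk.
syswrite($writer, "one\ntwo\nthree\n") or die "syswrite failed: $!";

my $select = IO::Select->new($reader);

# Poll without waiting: the socket has data pending.
my @ready_before = $select->can_read(0);

# Buffered read: hands back one line, but slurps everything available
# into Perl's own buffer behind the scenes.
my $first = <$reader>;

# Poll again: the kernel-side socket buffer is now empty.
my @ready_after = $select->can_read(0);

# The "missing" lines are still there, just not where select looks.
my $second = <$reader>;

print "ready before read: ", scalar(@ready_before), "\n";
print "ready after read:  ", scalar(@ready_after), "\n";
print "first line:  $first";
print "second line: $second";
```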

    I don't think that you have a choice about using non-blocking sockets. If you are receiving variable-length data, I can't recall a way to look at how much is in the socket buffer before reading. You can either read a char at a time, and keep checking via select (not very pretty), or switch to non-blocking.

    You mention having switched to non-blocking mode. Did you do this using sysread/recv? Could we see the code that did not work in this instance?

      the <> operator is a buffered operation. If there are 3 newlines within the received data, your program would have received up to the first, but the remaining data could well have been read off the socket by the buffering logic. The select would hang,

      Precisely. Excellent point; a particular pitfall to <$sock> not clearly pointed out before.

      I don't think that you have a choice about using non-blocking sockets. If you are receiving variable-length data, I can't recall a way to look at how much is in the socket buffer before reading. You can either read a char at a time, and keep checking via select (not very pretty), or switch to non-blocking.

      Wrong. If you use sysread (or recv) with select then there is no problem reading variable-length data. If you don't read enough to drain the input already available, then select will tell you that there is more to be read. If you ask for more data than is available, then sysread will return only what is available to you without hanging.
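
Both halves of that claim can be checked with a small sketch. It assumes a Unix-like system, and the socketpair and the five-byte payload are made up for the demonstration. Reading less than is available leaves the socket readable, and asking for far more than is available returns just the remainder.

```perl
use strict;
use warnings;
use Socket;
use IO::Select;

# Assumption: a local socket pair stands in for the network connection.
socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

syswrite($writer, "hello") or die "syswrite failed: $!";   # 5 bytes pending

my $select = IO::Select->new($reader);

# Take only 2 of the 5 bytes.
my $got_first = sysread($reader, my $head, 2);
die "sysread failed: $!" unless defined $got_first;

# select still reports the socket as readable: 3 bytes remain.
my @still_ready = $select->can_read(0);

# Ask for much more than is there; only the remainder comes back.
my $got_rest = sysread($reader, my $tail, 4096);
die "sysread failed: $!" unless defined $got_rest;

# Now the socket is drained and select reports nothing.
my @drained = $select->can_read(0);

print "first read:  $got_first bytes ($head)\n";
print "second read: $got_rest bytes ($tail)\n";
```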

      Of course, if you are hoping to process the input only as complete lines, then this adds a layer of complexity where you may have to buffer up partial lines read and only process them the next time select tells you that there is data to be read.

      However, the typical architecture for using select is to have an object for each socket, which means that buffering up partial lines in this object is no big deal.
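
One way such a per-socket object might look is sketched below. The class name Connection and its feed and pending methods are hypothetical, invented for this illustration; they do not come from any module. The object only holds the leftover partial line, so the select loop can hand it each sysread chunk and get back whatever complete lines are now available.

```perl
use strict;
use warnings;

package Connection;   # hypothetical class, one instance per socket

sub new {
    my ($class, $sock) = @_;
    return bless { sock => $sock, pending => '' }, $class;
}

# Add a freshly read chunk and return any lines it completes.
sub feed {
    my ($self, $chunk) = @_;
    $self->{pending} .= $chunk;
    my @lines;
    while ($self->{pending} =~ s/^(.*?)\r?\n//) {
        push @lines, $1;
    }
    return @lines;
}

# Whatever has arrived since the last complete line.
sub pending { $_[0]{pending} }

package main;

# No real socket is needed to exercise the buffering itself.
my $conn = Connection->new(undef);

my @first  = $conn->feed("PING :ab");                  # no newline yet
my @second = $conn->feed("c\r\nNOTICE x\r\nPAR");      # completes two lines

print "after first chunk:  ", scalar(@first), " line(s)\n";
print "after second chunk: $_\n" for @second;
print "still pending: ", $conn->pending, "\n";
```

In a select loop, a hash from each socket to its Connection object would let the code look up the right buffer for every handle that can_read returns.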

      - tye        

      well, what an informative reply.

      i had not considered the buffering logic reading ahead before select checks again, but it does seem to explain the problem i am having quite simply, thank you.

      unfortunately, i cannot provide working code at the moment, as i am at work and this is a little home project. in short, however, i moved over to using IO::Event, which apparently allows me to use <> to read the IO::Select object as though it were a file handle.

      thank you, it would appear Illuminatus has illuminated me ;)