mpaduano has asked for the wisdom of the Perl Monks concerning the following question:

Hello.

I have an unusual problem with IO::Select->select(). A client sends a message and closes its side of the connection. With tcpdump, I can see the data arrive at the server and get ACK'd at the TCP layer (the FIN is ACK'd in the same reply). Yet on the server, minutes pass before my select() call returns the socket as ready for reading. Other sockets in the fd_set receive data and are processed during this interval. Nothing appears blocked or frozen except this one socket (or sometimes a handful of them at once). Then, after three to ten minutes, select() returns the socket and everything proceeds, although the clients have sometimes timed out by this point.
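To make the scenario concrete, here is a minimal single-process sketch of the pattern described above: a client sends one message and closes, and the server side waits in IO::Select for the socket to become readable. It is not taken from the production system. The loopback address, the "hello" payload, the 5-second timeout and the wait_and_read() helper are all assumptions chosen for illustration, and a script this small may well fail to show the delay.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Listen on an ephemeral loopback port (LocalPort => 0 lets the OS choose one).
my $listener = IO::Socket::INET->new(
    Listen    => 5,
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
) or die "listen failed: $!";

# Client: send one message, then close its side of the connection (sends the FIN).
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listener->sockport,
    Proto    => 'tcp',
) or die "connect failed: $!";
defined $client->syswrite("hello\n") or die "write failed: $!";
$client->close;

my $conn = $listener->accept or die "accept failed: $!";
my $sel  = IO::Select->new($conn);

# Hypothetical helper: wait up to $timeout seconds for a readable socket,
# then read once. Returns (data, seconds waited); data is undef on timeout
# and the empty string at end-of-file.
sub wait_and_read {
    my ($select, $timeout) = @_;
    my $start  = time;
    my @ready  = $select->can_read($timeout);
    my $waited = time - $start;
    return (undef, $waited) unless @ready;
    my $buf = '';
    my $n = sysread($ready[0], $buf, 4096);
    die "sysread failed: $!" unless defined $n;
    return ($buf, $waited);
}

my ($data, $wait_data) = wait_and_read($sel, 5);   # expect the message
my ($eof,  $wait_eof)  = wait_and_read($sel, 5);   # expect end-of-file

printf "data: %s (waited %ds)\n", defined $data ? "got " . length($data) . " bytes" : "timed out", $wait_data;
printf "eof:  %s (waited %ds)\n", defined $eof && $eof eq '' ? "seen" : "not seen", $wait_eof;
```

If either wait reports a timeout even though the bytes and the FIN have already reached the machine, that would point at the select() layer rather than at application logic. Running the same script under both ActivePerl builds mentioned in this thread would be one way to compare them.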

The problem recurs, but I cannot reproduce it on demand. It affects only a small fraction of the total sockets being processed by the server.

We are using Windows Server 2003 and ActiveState Perl 5.6.1, build 635.

We have a support ticket open with Microsoft, since it looks like it may well be a Windows issue, but it is quite difficult to prove that to them. They are currently arguing that the Perl implementation is at fault. I cannot rule that out, so I am seeking any suggestions or knowledge that may help me get to the bottom of this problem.

Thanks for any help!

matt

Replies are listed 'Best First'.
Re: Windows socket select() problems ?
by rcaputo (Chaplain) on Jan 25, 2004 at 17:37 UTC

    Then again, it might be in your code. We can't really rule that out until you show us a test case. :/

    -- Rocco Caputo - rcaputo@pobox.com - poe.perl.org

Re: Windows socket select() problems ?
by rcaputo (Chaplain) on Jan 25, 2004 at 17:46 UTC

    Have you tried ActiveState's bug tracker for ActivePerl? (Sorry, I know they have one, but I always forget where they hide it.) Perhaps the issue is already reported.

    The latest ActivePerl is 5.8.2, build 808. Maybe the problem is already fixed?

    Perhaps the problem's in your source code. We can't really rule that out until you post a test case.

    -- Rocco Caputo - rcaputo@pobox.com - poe.perl.org

      I am currently trying to (a) reproduce the problem and (b) reproduce it with only a small extract from our mondo system. At present we only encounter the issue in our production system, and even there it is quite rare. If I succeed at reproducing it and still can't figure out what is wrong, I'll post some code. I was hoping someone might recognize the symptoms or have a less-than-obvious suggestion.

      Thanks.

      matt