skamerman has asked for the wisdom of the Perl Monks concerning the following question:

I wrote the following code to listen to a TCP port on a device and simply log its output. It seems to work fine for about 10-20 hours, but somewhere along the line it just stops, and I can't figure out why. I put the socket timeout all the way up to 604800 sec (7 days) hoping that it would help, but no luck. I have been using a script to restart the process every hour so it doesn't die on me anymore, but I would like it to just work. Any ideas?

---
use Net::Telnet();
use strict;

my ($host, $port, $socket, $line, $filename);
$host     = $ARGV[0];
$port     = $ARGV[1];
$filename = $ARGV[2];

$socket = new Net::Telnet (Telnetmode => 0, Timeout => 604800);
$socket->open(Host => $host, Port => $port);

for (;;) {
    $line = $socket->getline;
    open(LOGFILE, ">>$filename");
    print LOGFILE $line;
    close(LOGFILE);
}
exit 0;
---

Steve Kamerman
President
HardwareTechNet
www.hardwaretechnet.com

Replies are listed 'Best First'.
Re: Problem Keeping Socket Alive
by Mr. Muskrat (Canon) on Dec 23, 2002 at 16:55 UTC

    One problem is that the timeout method "sets the timeout interval that's used when performing I/O or connecting to a port. When a method doesn't complete within the timeout interval then it's an error and the error mode action is performed." If the remote process isn't going to output something for 7 days then you should disconnect from the socket, sleep(604800) and then reconnect. In other words, lower your timeout value and look elsewhere for the problem.

    Another problem is that you assume that the process runs all the time and that there will never be a loss of connectivity. How about checking that the connection still exists before (or when) trying to read from the socket?
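
The connection check suggested above could be sketched roughly as follows. This is only an illustration, not a tested fix for the original script: the helper names (`append_log`, `connect_to`, `run`), the 60-second read timeout, and the 10-second retry delay are all assumptions chosen for the example. It relies on Net::Telnet's `Errmode => 'return'` so that a failed `getline` returns undef instead of killing the script, and on `timed_out` to tell an idle line apart from a dropped connection.

```perl
use strict;
use warnings;

# Append one line to the log file; returns true on success.
sub append_log {
    my ($filename, $line) = @_;
    open(my $log, '>>', $filename) or return 0;
    print {$log} $line;
    return close($log);
}

# Hypothetical helper: open a raw TCP connection through Net::Telnet.
# Errmode => 'return' makes failed calls return undef instead of dying.
# The module is loaded here so the other subs work without it installed.
sub connect_to {
    my ($host, $port) = @_;
    require Net::Telnet;
    my $t = Net::Telnet->new(Telnetmode => 0, Timeout => 60, Errmode => 'return');
    return $t->open(Host => $host, Port => $port) ? $t : undef;
}

# Read lines forever, reconnecting whenever the connection is lost.
sub run {
    my ($host, $port, $filename) = @_;
    my $socket;
    for (;;) {
        $socket ||= connect_to($host, $port);
        unless ($socket) {
            sleep 10;                 # assumed retry delay
            next;
        }
        my $line = $socket->getline;
        if (defined $line) {
            append_log($filename, $line)
                or warn "cannot write to $filename: $!\n";
        }
        elsif ($socket->timed_out) {
            next;                     # nothing arrived within the timeout; keep waiting
        }
        else {
            # EOF or another read error: drop the handle and reconnect
            $socket->close;
            undef $socket;
            sleep 10;
        }
    }
}

# Only start the loop when called as: script.pl host port logfile
run(@ARGV) if @ARGV == 3;
```

Checking the result of `open` on the log file also means a full disk or a permissions problem shows up as a warning rather than as silently missing lines.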

      This program is logging phone calls in real time over ethernet, so I could be waiting from 1 sec to 5 hours for more output on the socket, depending on whether anybody calls or not. I will lower the timeout, though. Maybe I'll try 7200 sec (2 hours).

      I explained the problem poorly - the script doesn't actually "die" - it's still running, just not logging the output anymore, and if I try to telnet to my device at the port that my script is supposed to be listening to, I am refused - so I know the socket is still open.
        What does netstat show as the state of the connection when this error state happens? Also, do you have control of the sending side? If so, it might make sense to send a heartbeat (a dummy packet that the logger discards) every few minutes, so that any firewalls between the two hosts treat the connection as active over long periods of inactivity.

        Edited:
        Also, getline is fine. And since you don't need this to timeout at all, you can use undef for the timeout value to turn off timeouts completely.

        -Waswas
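
On the logging side, discarding such a heartbeat might look something like this. It is a minimal sketch that assumes the sender can be made to emit a fixed marker line; the marker `#HEARTBEAT` and the sub names are invented for the example and are not anything the device is known to send.

```perl
use strict;
use warnings;

# Hypothetical marker; the sending side would have to emit exactly this line.
my $HEARTBEAT = '#HEARTBEAT';

# True if the line is a heartbeat, ignoring any trailing CR/LF.
sub is_heartbeat {
    my ($line) = @_;
    (my $text = $line) =~ s/[\r\n]+\z//;
    return $text eq $HEARTBEAT;
}

# Return only the lines that should reach the log.
sub filter_heartbeats {
    return grep { !is_heartbeat($_) } @_;
}
```

In the read loop, each line from `getline` would be passed through `is_heartbeat` and written to the log only when the check fails.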
Re: Problem Keeping Socket Alive
by MarkM (Curate) on Dec 23, 2002 at 21:33 UTC

    Timeout values passed to alarm(), select(), poll(), or other such system calls do have maximum values. If the value you specify exceeds the maximum, the usual behaviour is for the operating system to use the maximum value instead. Your implementation is not reliable due to this limitation.

    One solution for you may be to disable the timeout code for the getline() method: $socket->getline(Timeout => undef);
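
A read loop built around that call might look like the sketch below. The sub name `log_lines` and the overall structure are assumptions made for illustration; the only parts taken from the thread are `Telnetmode => 0` and `getline(Timeout => undef)`. `Errmode => 'return'` is added so that end-of-file or a read error ends the loop with a message instead of an abrupt die.

```perl
use strict;
use warnings;
use IO::Handle;

# Read lines until getline returns undef (EOF or error), appending each
# one to the log. $socket is expected to be a Net::Telnet object created
# with Errmode => 'return'. Returns the number of lines written.
sub log_lines {
    my ($socket, $filename) = @_;
    open(my $log, '>>', $filename) or die "cannot open $filename: $!\n";
    $log->autoflush(1);    # write each line to disk as it arrives
    my $count = 0;
    # Timeout => undef turns off the timeout for this read only
    while (defined(my $line = $socket->getline(Timeout => undef))) {
        print {$log} $line;
        $count++;
    }
    close($log);
    return $count;
}

# Only connect when called as: script.pl host port logfile
if (@ARGV == 3) {
    my ($host, $port, $filename) = @ARGV;
    require Net::Telnet;
    my $socket = Net::Telnet->new(Telnetmode => 0, Errmode => 'return');
    $socket->open(Host => $host, Port => $port)
        or die "cannot connect to $host:$port\n";
    my $n = log_lines($socket, $filename);
    warn "connection ended after $n lines: ", $socket->errmsg, "\n";
}
```

With no timeout, a connection that is silently dropped by something in between will leave `getline` blocked forever, so this is best combined with some way of noticing a dead link.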

      Thanks - I did not know I could set the timeout for getline. I will try it tonight and post my results tomorrow - yes, I do have to work on Christmas Eve :(