in reply to Re: IO::Socket doesn't detect lost TCP connections
in thread IO::Socket doesn't detect lost TCP connections
The receiving system is a vendor package, and keepalive heartbeats are not part of it. Once this script is functioning at a production level, variations of it will be used for multiple systems. Of the ones that I am familiar with, only one uses a heartbeat. It would be prohibitively expensive to pay all of the vendors to modify their systems.
Re^3: IO::Socket doesn't detect lost TCP connections
by DaveH (Monk) on Sep 04, 2004 at 16:55 UTC
I have implemented similar gateway interfaces in the past. One thing to bear in mind is that a "keep alive" message does not need to be an explicit part of the protocol you are running over TCP/IP. Anything that tests whether the connection is alive is sufficient. For example, you may be able to send some sort of benign transaction that is valid in the currently defined protocol but has no effect on the running backend system (i.e. some sort of read-only query). This would serve the same purpose as an explicit keep-alive packet. Obviously, you will need to investigate whether such a query exists and would be suitable for this purpose.
The other thing I would do is avoid the buffered input and output functions for production socket work. I prefer sysread/syswrite for socket reads and writes because dropped connections and end-of-file conditions are easier to detect. You can also see how much data was read or written with each operation, so you can detect short reads and short writes. I also prefer using IO::Select to see whether my sockets can be read from or written to, and I use this to implement my own timeout mechanism.
However, I would recommend searching CPAN for TCP, since this turned up lots of interesting higher-level modules that should hide some of the low-level socket guts from your program. I would also always recommend looking at POE for any socket programming work, since many socket-based programs fall into the event-driven category (i.e. "wait for X condition, then do Y"), and POE is one of the best frameworks for getting event-based programs written quickly. It is worth having a look at the POE website in addition to the POD documentation on CPAN.
You also mentioned that you are using MQSeries for your message queueing. Are you aware that there is an MQSeries Perl module available on CPAN? This may be more suitable than calling out to an external program to retrieve messages.
Find below the sort of code I have used in the past for socket operations. Use it at your own risk, and bear in mind that it is not working code; you will need to customise it. It is provided in the hope that it is useful.
Read more... (4 kB)
I hope that this helps. Cheers, -- Dave :-) $q=[split+qr,,,q,~swmi,.$,],+s.$.Em~w^,,.,s,.,$&&$$q[pos],eg,print
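The collapsed code from the post above is not reproduced here. What follows is not DaveH's code, only a minimal sketch of the sysread/syswrite plus IO::Select approach he describes. The helper names write_all and read_some, their arguments, and the choice to die on timeout are all assumptions made for illustration.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: write all of $data to $sock, looping over short writes.
# Dies on timeout or on a write error (e.g. EPIPE, ECONNRESET).
sub write_all {
    my ($sock, $data, $timeout) = @_;
    local $SIG{PIPE} = 'IGNORE';    # get EPIPE from syswrite instead of a fatal signal
    my $sel    = IO::Select->new($sock);
    my $offset = 0;
    while ($offset < length $data) {
        $sel->can_write($timeout) or die "write timed out\n";
        my $n = syswrite($sock, $data, length($data) - $offset, $offset);
        die "write failed: $!\n" unless defined $n;
        $offset += $n;              # short write: the loop sends the remainder
    }
    return $offset;
}

# Hypothetical helper: read up to $max bytes from $sock.
# Returns the data, or undef when the peer has closed the connection (EOF).
# Dies on timeout or on a read error.
sub read_some {
    my ($sock, $max, $timeout) = @_;
    IO::Select->new($sock)->can_read($timeout) or die "read timed out\n";
    my $n = sysread($sock, my $buf, $max);
    die "read failed: $!\n" unless defined $n;
    return undef if $n == 0;        # zero bytes from sysread means end of file
    return $buf;
}
```

A caller would typically wrap both helpers in eval { ... } and treat either an exception or an undef from read_some as the cue to reconnect. Note that a successful return from write_all only means the local kernel accepted the bytes, not that the remote end received them.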
by tjdmlhw (Acolyte) on Sep 04, 2004 at 21:42 UTC
Thanks for all the useful information. I am taking Labor Day off, but will definitely try some of your suggestions when I get back to the office Tuesday. I am aware of the MQSeries Module in CPAN and have used it in variations of my script on my PC. Unfortunately, to load the module, you need a C compiler and there wasn't one on my AIX Node. Since this is a production box, the systems group wouldn't let me load a compiler. A search of Monks turned up the q and qc programs furnished by IBM and I started using those. The systems people later agreed to install the compiler, but qc has been working well for me and I haven't bothered to switch back.
by Anonymous Monk on Sep 04, 2004 at 17:20 UTC
by tjdmlhw (Acolyte) on Sep 07, 2004 at 19:18 UTC
I tried the above program, but it doesn't seem to do what I need. I added a few prints to show some of the return values and a sleep to give me time to bounce the receiving test tool. I've shown the test results and the code as used below. The first test was just connecting, sending, and receiving data without any interruptions. This worked without any problems.
Next I established the connection and used the sleep time to kill it before a send was attempted. The code did not recognize that the connection was dead on the write, but did on the following read.
The last test was stopping and starting the receiving system before a transaction was attempted. The results were exactly the same as for the killed-connection test above.
What I was hoping for was some way of detecting, on or before the write, that the connection had been lost. I am currently modifying my script to trigger a reconnect when the read fails. This will work for the current interface, but may not for future interfaces that don't send an ACK back. If you know of anything else that I can try, I would appreciate the suggestion.
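One approach that may be worth trying for detection before the write: the first write after the peer goes away usually succeeds because the local kernel simply queues the data, which matches the behaviour seen in the tests above. If the peer closed or reset the connection, though, the local socket becomes readable, and a peek that returns zero bytes indicates end of file. The sketch below assumes that situation; the helper name peer_has_closed and its optional timeout argument are hypothetical.

```perl
use strict;
use warnings;
use IO::Select;
use Socket qw(MSG_PEEK);

# Hypothetical helper: returns 1 if the peer is known to have closed or reset
# the connection, 0 if the socket still looks usable.
# $timeout is in seconds; it defaults to 0, which polls without blocking.
sub peer_has_closed {
    my ($sock, $timeout) = @_;
    $timeout = 0 unless defined $timeout;

    # Not readable within the timeout: no close or reset has been seen so far.
    return 0 unless IO::Select->new($sock)->can_read($timeout);

    # Peek rather than read, so genuine pending data stays in the receive queue.
    my $buf = '';
    my $rv  = recv($sock, $buf, 1, MSG_PEEK);
    return 1 unless defined $rv;      # recv error, e.g. ECONNRESET
    return length($buf) ? 0 : 1;      # readable but zero bytes means EOF
}
```

Calling peer_has_closed($sock) just before each write, and reconnecting when it returns true, would cover a receiving system that is stopped or restarted cleanly, even on interfaces that never send an ACK. It cannot see a pulled cable or a crashed host, because in those cases nothing arrives to tell the local TCP stack that the connection is gone.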