Ossory has asked for the wisdom of the Perl Monks concerning the following question:

Hello, monks.

I am not sure if my problem can be helped at all and if it really belongs here, but I am rapidly losing hope, so here goes:

I have a Perl application that accesses a MySQL database on a regular basis (say, about once per second). To interact with the database, I am using the "Mysql" module (the one that is imported using use Mysql, where you connect using $handle = Mysql->Connect(blah... )). Note that I cannot use DBI. I could probably port the whole thing to DBI if I were absolutely sure that the problem would be solved this way.

Now, auto-reconnect is enabled by default. So, basically, when (IF) the remote MySQL daemon stops or something similar happens, the program starts spewing out "MySQL server has gone away" messages (which is perfectly normal and is the intended behaviour in such cases; these errors are caught and dealt with, so no data is lost). WHEN MySQL comes back, the connection is automatically reestablished and the program gets on with whatever it is intended to do. This behaviour is not explicitly coded. I believe it is done automatically by the Perl MySQL interface. This is not the problem.

The problem is as follows. This program has to run on certain hardware, and this hardware has an unsolved defect: due to a bug in the kernel module that governs the network interface, the interface sometimes "hangs". It remains in this state for a second or two, then it resets and resumes operation. When this happens, the MySQL connection dies (it starts giving "Server gone away" errors). That would be reasonable, since the interface is hung. When the interface resets, however, the connection does NOT come back up (it still gives "Server gone away" errors) and makes no attempt to reconnect. If I do an if ($handle) {blah.. } on the handle, the expression in the parens evaluates to true, which, I think, means that the connection thinks that it is in an "operating" state, and so it makes no attempt to reestablish itself. If it evaluated to false, I could use this condition to manually reestablish the connection by making a "Mysql->Connect" call. When the program becomes trapped in this "no-network" state (with the network actually present), all I can do is restart it manually (or through some watchdog script), but this, of course, is not the best solution.

Naturally, I could catch the Mysql error using the error code returned (to isolate the "Server has gone away" error from the other possible errors) and try to reestablish the connection every time such an error happens, but this can not be done due to various reasons.
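
For illustration, here is a minimal sketch of the error-code approach just described. It assumes the numeric client error code is available after a failed query; how it is read from the handle is left to the caller. The codes 2006 ("MySQL server has gone away") and 2013 ("Lost connection to MySQL server during query") come from the MySQL client library. The names is_connection_lost and query_with_reconnect, and the two code refs passed in, are hypothetical and not part of the Mysql module.

```perl
use strict;
use warnings;

# Client error codes defined by the MySQL client library.
use constant {
    CR_SERVER_GONE_ERROR => 2006,  # "MySQL server has gone away"
    CR_SERVER_LOST       => 2013,  # "Lost connection to MySQL server during query"
};

# True when the error number means the connection itself is dead,
# as opposed to e.g. a syntax error or a duplicate key.
sub is_connection_lost {
    my ($errno) = @_;
    return 0 unless defined $errno;
    return ($errno == CR_SERVER_GONE_ERROR || $errno == CR_SERVER_LOST) ? 1 : 0;
}

# Hypothetical wrapper. $run_query returns (ok, errno); $reconnect
# rebuilds the handle (e.g. by calling Mysql->Connect again).
# Both are caller-supplied code refs, not part of any module.
sub query_with_reconnect {
    my ($run_query, $reconnect) = @_;
    my ($ok, $errno) = $run_query->();
    return 1 if $ok;
    return 0 unless is_connection_lost($errno);  # other errors: leave to caller
    $reconnect->();
    ($ok) = $run_query->();                      # retry once on the new handle
    return $ok ? 1 : 0;
}
```

Note that this reconnects on every "gone away" error, whether the server is really down or the handle is merely stuck, which is exactly the cost that rules it out here.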

Sorry if my explanations make little or no sense to you. Hope someone can at least tell me what to try next.

Replies are listed 'Best First'.
Re: Mysql auto-reconnect doesn't kick in on network interface crash.
by Anonymous Monk on Feb 04, 2009 at 09:13 UTC
    which, I think, means that the connection thinks that it is in an "operating" state, and so it makes no attempt to reestablish itself
    Find out fo sho. You could try checking auto_reconnects_failed/auto_reconnects_ok.

    but this can not be done due to various reasons.
    Do it anyway :)

      Thanks for the advice, I will look into auto_reconnects_failed/auto_reconnects_ok.

      Do it anyway :)
      Well, this is not possible, and here's why: the application in question is reading from a fifo. Another app writes into that fifo in nonblocking mode at a very high rate. Attempting to restore the connection from scratch takes quite some time, and there is a good chance that the fifo will overflow, so some data will be dropped from it. Sometimes the MySQL server does, indeed, "go away": it gets taken down for backup, etc. So I cannot really attempt to restore the connection every single time a query fails because "The server has gone away". My aim is to isolate the problem that I described in my original post so that I can deal with it in its own way and leave everything else intact.

        I think you're approaching the problem the wrong way. Either you're not bothered by the events that get lost when the server legitimately "goes away", or you are.

        If the records are that precious, you should have some kind of caching mechanism on your side that will read from the fifo and store the records until you can send them to the Mysql server. It could be a simple thread that stores stuff into one end of an array, while your thread takes them out from the other end.
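
The caching idea above could look roughly like the following. This is a single-threaded sketch under stated assumptions, not a drop-in solution: the names @pending, enqueue_record and flush_pending are hypothetical, and $send stands for whatever code actually runs the INSERT and reports success or failure. A threaded variant could replace the plain array with the core Thread::Queue module.

```perl
use strict;
use warnings;

# Hypothetical in-memory buffer: records read from the fifo are pushed
# on one end and shifted off the other once MySQL has accepted them.
my @pending;

# Store one record and return the current backlog size.
sub enqueue_record {
    my ($record) = @_;
    push @pending, $record;
    return scalar @pending;
}

# $send is a caller-supplied code ref (an assumption of this sketch)
# that returns true when the record was stored and false on failure.
# Returns the number of records sent during this call.
sub flush_pending {
    my ($send) = @_;
    my $sent = 0;
    while (@pending) {
        last unless $send->($pending[0]);  # keep it queued on failure
        shift @pending;
        $sent++;
    }
    return $sent;
}
```

The main loop would call enqueue_record for every record read from the fifo and flush_pending after each read. While the server is away the backlog simply grows, so a real version would also need a size limit or a spill-to-disk policy.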

        Otherwise, don't sweat the time you lose trying to reconnect to a server that has legitimately gone away - if you can't store the stuff you're not sending to the server, the events sent while the server is down are lost, anyway.