This has the hallmarks of the start of a "merry trip through hell". I looked at this:
IO-Socket Bugs
IO::Socket::SSL is not threadsafe. This is because IO::Socket::SSL is based on Net::SSLeay which uses a global object to access some of the API of openssl and is therefore not threadsafe. It might probably work if you don't use SSL_verify_callback and SSL_password_cb.
Non-blocking and timeouts (which are based on non-blocking) are not supported on Win32, because the underlying IO::Socket::INET does not support non-blocking on this platform.
I would interpret this as meaning that there is going to be trouble with a multi-threaded Win32 daemon (a "service" in Windows lingo) - i.e. the solution is not likely to be either "fast" or "easy".
I would start by trying to replicate the problem on a test box with some "CPU eater" processes to simulate server load. I doubt that the fact that this is built as a .exe file matters - this file is unpacked when the service starts running. I would guess that to make this work really well, you would have to add some thread coordination around this non-thread-safe module. But a kludge may work...
The older version of Perl may be a factor, but the current BUG list seems to indicate that this is not a "magic bullet". If a newer Perl version does help, the fact that this service is deployed as an .exe may actually help as this service can use a different Perl version than the system itself (no interaction - the newer version of Perl gets put into the .exe).
Contrary to some opinions, SIGALRM does work on Windows, although there are quirks (sleep(), for example, is implemented in terms of SIGALRM - however daemons don't normally "sleep"). Perl >= 5.7.3 uses what are called "safe" or "deferred" signals by default, which prevents their delivery during certain OS functions.
To implement your own "timeout", you probably have to override this and use normal "unsafe" signals, then completely trash the "timed-out" child thread (because it is not "safe" to continue) and start over again. In other words: "I'm stuck", blow up, completely restart this thread. I don't know whether that kind of "patch" would work or how hard it would be for you to implement. Look at "safe signals" in perlipc. If "trash the thread and start over" is an option, this might be a good "patch".
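For what it's worth, perlipc describes getting the old "unsafe" delivery back for a single signal by installing the handler with POSIX::sigaction, which bypasses the deferred mechanism. Here is a minimal sketch of a timeout built on that. The name with_timeout and its return convention are my own invention, and I have only reasoned about this on a Unix-like box; whether POSIX::sigaction is usable on your Win32 Perl is an assumption you would have to test.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical helper: run $code, give up after $seconds.
# Returns (1, $result) on success, (0, undef) on timeout.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $marker = "timed out\n";

    # Install an "unsafe" (immediate) handler, as perlipc suggests.
    # Note: the handler stays installed after this sub returns.
    POSIX::sigaction(SIGALRM, POSIX::SigAction->new(sub { die $marker }))
        or die "cannot install SIGALRM handler: $!";

    my $result;
    my $ok = eval {
        alarm $seconds;
        $result = $code->();
        alarm 0;            # finished in time, cancel the alarm
        1;
    };
    my $err = $@;
    alarm 0;                # make sure no alarm is left pending

    return (1, $result) if $ok;
    return (0, undef)   if $err eq $marker;
    die $err;               # some other failure: pass it on
}
```

After a timeout, whatever $code was in the middle of is in an unknown state, which is exactly why I would throw that worker away and start a fresh one rather than let it carry on.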
I am not experienced at threads. In a fork based server, I would have the child exit(99) after an "un-recoverable" timeout (a "non-safe" SIGALRM). Have the parent get its normal SIGCHLD, and its signal handler would then check your exit status and if "99" (as opposed to "0" or whatever), I'd restart your butt! (Meaning fork another child process with the same "mission" as you had before). Other wiser Monks than me will know how to do this with threads. This kind of a kludge may work "well enough"?
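To make the fork idea concrete, here is a rough sketch under some loud assumptions: do_mission, run_child and supervise are made-up names, the "stuck" job is simulated with a long select(), and the parent uses a plain blocking waitpid instead of a SIGCHLD handler to keep it short. It also assumes a platform with a real fork; on Win32 Perl fork is emulated, so I would not expect this to carry over unchanged.

```perl
use strict;
use warnings;

use constant TIMEOUT_EXIT => 99;   # child's way of saying "I got stuck, restart me"

# Hypothetical job: the first attempt hangs, later attempts return at once.
sub do_mission {
    my ($attempt) = @_;
    select(undef, undef, undef, 10) if $attempt == 1;   # simulate being stuck
    return;
}

# Runs in the child only; never returns.
sub run_child {
    my ($attempt, $timeout) = @_;
    local $SIG{ALRM} = sub { exit TIMEOUT_EXIT };   # un-recoverable: just leave
    alarm $timeout;
    do_mission($attempt);
    alarm 0;
    exit 0;
}

# Returns the attempt number that succeeded, or 0 if every attempt timed out.
sub supervise {
    my ($max_attempts, $timeout) = @_;
    for my $attempt (1 .. $max_attempts) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        run_child($attempt, $timeout) if $pid == 0;

        waitpid($pid, 0);                 # stand-in for the SIGCHLD handler
        my $status = $? >> 8;
        return $attempt if $status == 0;
        next if $status == TIMEOUT_EXIT;  # same mission, fresh child
        die "child exited with unexpected status $status";
    }
    return 0;
}
```

Calling supervise(3, 1) lets each child run for at most one second before it is replaced. A real server would do the waitpid from its SIGCHLD handler so the parent can keep accepting connections in the meantime.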
Update: See a recent post by me about alarm handlers at Re: Race condition with Mail::Sender::MailMsg? for the "general formula" to alarm a function. You have a complicated situation, so read the "yeah, but" links.