Re: Why Coro?

If you attempt to run “3,000 threads,” let alone 12,000 ...

"You're dead, Jim ..." — Bones

The number of threads should be determined by the number of requests that you can actually process “at the same time” on your hardware. It should be a variable number, and it should be fairly small.

A small number of threads can very efficiently serve a large number of devices, on the presumption that “not every device will be sending data to us at the same instant.” There should be these thread pools:

1. A very small number of threads (perhaps only one ...) that gathers the incoming requests as they arrive and places them onto a queue.
2. A somewhat larger pool of threads that services the queue. (They also note when the last request arrived from each device. Perhaps there is a “watchdog” thread that periodically looks for “dead birds,” i.e. devices that have gone silent.)
3. If necessary, a third small set of threads that sends acknowledgment responses back to the devices, unless the first-pool threads can also handle this duty.
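
To make the first two pools concrete, here is a minimal sketch, written in Python purely for illustration. The names run_pipeline, handle, NUM_WORKERS and QUEUE_LIMIT are made up for this example, and the default values are assumptions to be tuned, not recommendations. One gatherer thread feeds a bounded queue; a small worker pool drains it and records when each device was last heard from. The acknowledgment pool is left out.

```python
import queue
import threading
import time

NUM_WORKERS = 4    # assumed value; tune to what the hardware can really run at once
QUEUE_LIMIT = 100  # assumed bound; a full queue makes the gatherer wait


def run_pipeline(incoming, handle, num_workers=NUM_WORKERS, queue_limit=QUEUE_LIMIT):
    """Push (device_id, payload) pairs through a bounded queue to a worker pool.

    Returns (results, last_seen): a list of (device_id, outcome) pairs and a
    dict mapping each device_id to the time its latest request was serviced.
    """
    requests = queue.Queue(maxsize=queue_limit)
    results = []
    last_seen = {}
    lock = threading.Lock()  # protects results and last_seen

    def gatherer():
        # Pool 1: accept requests as they arrive and put them on the queue.
        for device_id, payload in incoming:
            requests.put((device_id, payload))  # blocks while the queue is full
        for _ in range(num_workers):
            requests.put(None)  # one stop marker per worker

    def worker():
        # Pool 2: service the queue until a stop marker shows up.
        while True:
            item = requests.get()
            if item is None:
                break
            device_id, payload = item
            try:
                outcome = handle(payload)
            except Exception as exc:  # keep the worker alive; record the failure
                outcome = exc
            with lock:
                last_seen[device_id] = time.monotonic()
                results.append((device_id, outcome))

    threads = [threading.Thread(target=gatherer)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, last_seen


if __name__ == "__main__":
    # 12,000 requests from 3,000 hypothetical devices, served by 4 workers.
    readings = [(f"dev{i % 3000}", i) for i in range(12000)]
    results, last_seen = run_pipeline(readings, handle=lambda x: x * 2)
    print(len(results), "requests handled for", len(last_seen), "devices")
```

The point of the sketch is that the thread count is set by num_workers, not by the number of devices, and that the bounded queue is the only place where a burst of traffic can pile up.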

You can easily see how this works, and how easily it can be tuned. We need to gather requests with acceptably low latency, and to know if a device is dead, so that's what the first (and third) pools do. Then, we need to be sure that the requests can be processed effectively once received, without bottlenecks, and this is what the second pool does. Because the queue sits between them, a burst of arrivals simply waits in line instead of demanding more threads, so nothing will get out of hand.
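
The “is this device dead?” check can be sketched the same way. This is only an illustration in Python under stated assumptions: find_silent_devices and watchdog are hypothetical names, last_seen is assumed to be a dict of device id to time.monotonic() timestamp kept up to date by whichever threads receive the requests, and the timeout and polling interval are values you would choose yourself.

```python
import threading
import time


def find_silent_devices(last_seen, timeout, now=None):
    """Return the ids of devices not heard from within `timeout` seconds."""
    if now is None:
        now = time.monotonic()
    return sorted(dev for dev, seen in last_seen.items() if now - seen > timeout)


def watchdog(last_seen, lock, timeout, interval, stop, report):
    """Every `interval` seconds, report silent devices until `stop` is set."""
    # Event.wait() returns False on timeout and True once the event is set,
    # so the loop body runs once per interval and exits promptly on shutdown.
    while not stop.wait(interval):
        with lock:
            silent = find_silent_devices(last_seen, timeout)
        if silent:
            report(silent)
```

A single thread started with threading.Thread(target=watchdog, args=(...)) is enough here, since the check is one cheap pass over the table no matter how many devices there are.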

Also note that there are many CPAN packages which are already built to implement this sort of thing, because it is a very common scenario. (Heck, it dates all the way back to IBM's “CICS” product on the mainframes of the late 1960s.) Never do a thing that has already been done... it is very easy to find yourself doing exactly that.