Perhaps I was not clear in my original post, but by "threads" I meant the situation when a server forks/creates a new thread to handle new incoming connections:
... My concern is that because of the use of threads in such frameworks, keeping a DB handle over time will lead to problems.
I think it is a pretty sensible thing to ask: what happens to the parent's DB handle on fork, and what is good practice in this case? Perhaps a DB handle should not be created in the parent (i.e. before a fork) but only in each child (after forking), and closed when the child's life ends. But that rules out using a pool of DB handles, and also keeping a single DB handle open "for ever" within the server itself (rather than within each of its children).
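To sketch the connect-after-fork pattern I mean (my own toy example with placeholder DSN and credentials, not the Mojo test code mentioned below):

    use strict;
    use warnings;
    use DBI;

    for my $n (1 .. 3) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        next if $pid;    # parent: keep forking, never touches the DB

        # child: open a fresh handle that belongs to this process only
        my $dbh = DBI->connect('dbi:mysql:dbname=test', 'user', 'pass',
                               { RaiseError => 1, AutoCommit => 1 });
        # ... serve requests with $dbh ...
        $dbh->disconnect;    # closed when the child's life ends
        exit 0;
    }
    wait() for 1 .. 3;       # parent reaps its children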
I have updated my question to include this clarification and also test code for a simple Mojo server ... which unfortunately refuses to fork!
By default, Mojo still doesn't fork :-) It is designed specifically to enable the style of programming where you have only one process and it reacts to an incoming connection (event) to do a little processing, and if it needs to do a slow operation like a database query it starts that operation (linked to a callback) and then returns to look for more incoming connections. If the next event is a new request, it also begins that request and may also leave it hanging on a long operation linked to a callback. If one of the operations completes (database returns results) that is an event which gets picked up next and executed some more, possibly finishing the request and sending it back to the client. At any given moment, there is still only one thread of one process doing the work, but it can interleave its ongoing tasks as they become ready to work on. This is what I mean by "event-driven" programming, and enabling it was the primary reason why Mojo was invented when other pretty-good frameworks already existed. Catalyst and Dancer can't do this (well) because they weren't designed for it.
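To make that concrete, here is a rough sketch of the style (my own toy example, not from the Mojolicious docs); a one-second timer stands in for a slow, non-blocking database call:

    use Mojolicious::Lite;
    use Mojo::IOLoop;

    get '/slow' => sub {
        my $c = shift;
        $c->render_later;                  # don't render on return; we'll render later
        Mojo::IOLoop->timer(1 => sub {     # pretend this is a non-blocking DB query
            $c->render(text => "done after 1 second\n");
        });
        # control falls back to the event loop here, so other requests
        # can be accepted while the "query" is still pending
    };

    app->start;

While /slow is waiting on its timer, the same single process can keep answering other requests; that interleaving is the whole point of the event-driven design.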
You can also run a Mojo app behind a Plack server that forks, and in that case the server layer only ever passes one connection to Mojo at a time, defeating the event-driven features and making it just like a worker process of any other framework. But note that the Plack server (like Gazelle) is the one forking, not Mojo, in this case.
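For example (assuming the Gazelle Plack server is installed), you could start a Mojolicious app with something like "plackup -s Gazelle ./script/my_app": Gazelle does the pre-forking, and each worker then feeds requests to the Mojo app one at a time.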
You can also attempt to run, e.g., Catalyst or Dancer behind an event-driven Plack server like Twiggy, and then you have to worry about all the popular plugins for those frameworks which will make blocking calls to DNS, the DB, Redis, etc. and mess up your event-driven throughput and hang your other requests. I also suspect that the database models for those frameworks would not take well to being used in event-driven style, and would very likely conflict on using the same DB handle, but I've never tried.
So, back to your original question, it sounds like you actually were asking "what happens when a pre-forking web app started with a database connection before it forked off its first worker, and then the worker tries to use that same handle". I don't actually know the answer to that, but I know it's not really a problem, because I've been using pre-forked apps for more than a decade and some of them definitely open a connection to the DB before they fork. Does it result in a wasted DB connection? Maybe. I've never been constrained by resources to the point where one additional connection would hurt. If your web app is so large that you run it on many front-end hosts (say, 50 hosts, each with a master process and a pool of workers), then maybe you'd care about those 50 wasted connections.
Actually I think I looked into it once and found that DBD::mysql checks the process ID to see if it's the same as the one that opened the connection, and if not it makes a new connection. That's probably how they avoid two processes talking on the same socket. It's been a long time though and things may have changed. But they still work :-)
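If you want that same kind of safety at the application level rather than relying on the driver, a rough sketch (my own, with placeholder DSN and credentials, not DBD::mysql internals) could look like this:

    use strict;
    use warnings;
    use DBI;

    my ($dbh, $dbh_pid);

    sub get_dbh {
        # reconnect if we have never connected, or if we are now in a different
        # process than the one that opened the handle (i.e. after a fork)
        if (!$dbh || $dbh_pid != $$) {
            $dbh = DBI->connect('dbi:mysql:dbname=test', 'user', 'pass',
                                { RaiseError => 1,
                                  # AutoInactiveDestroy is a DBI attribute that stops a
                                  # forked child's DESTROY from closing the parent's socket
                                  AutoInactiveDestroy => 1 });
            $dbh_pid = $$;
        }
        return $dbh;
    }

Every piece of code then asks get_dbh() for the handle instead of caching $dbh itself, and a forked child automatically gets its own connection on first use.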
Also, MySQL connections are extremely cheap and plentiful, so no need to worry about those. Postgres connections are cheap, but less plentiful, and supposedly large webapps (hundreds of workers) on Postgres need to use PgBouncer to avoid running out of connection slots. SQL Server connections are more expensive, but the library does a weird multiplexing thing on the same socket so that connections seem cheap after you've connected once. I've never used Oracle. And finally, SQLite I believe always runs in the same thread that made the request, so you can't do parallel event-driven stuff with it anyway.
it sounds like you actually were asking "what happens when a pre-forking web app started with a database connection before it forked off its first worker, and then the worker tries to use that same handle".
Yes, and thanks for the information. I will keep this in mind when proceeding.