Re: Database access and async web framework (like Mojolicious)
by Arunbear (Prior) on Dec 12, 2024 at 12:15 UTC
Mojolicious itself does not seem to use Perl threads.
% git remote -v
origin git@github.com:mojolicious/mojo.git (fetch)
origin git@github.com:mojolicious/mojo.git (push)
% ls
Changes examples lib LICENSE Makefile.PL MANIFEST.SKIP README.md script t
% rg 'use threads'
% rg 'threads'
%
The SQLite docs indicate it may not be suitable if your workflow involves a high degree of concurrent writing, but you'd need to assess/test if you're likely to reach that point.
Using Time::HiRes, you could check how expensive getting the handle really is. Chances are it may not be significant.
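A minimal sketch of that measurement (the file name and connect attributes are made up for illustration):

```perl
use strict;
use warnings;
use DBI;
use Time::HiRes qw(gettimeofday tv_interval);

# Time how long obtaining a fresh SQLite handle actually takes.
my $t0  = [gettimeofday];
my $dbh = DBI->connect('dbi:SQLite:dbname=app.db', '', '',
    { RaiseError => 1, AutoCommit => 1 });
printf "connect took %.6f s\n", tv_interval($t0);
$dbh->disconnect;
```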
Re: Database access and async web framework (like Mojolicious)
by Corion (Patriarch) on Dec 12, 2024 at 11:51 UTC
I follow very much the same strategy, using DBD::SQLite and potentially later switching to a different database.
The main change I expect is switching the connection string to the next database, and that should be more a configuration change than a program change.
I recently thought I needed (and added) a "reconnect" feature to release/refresh database handles after hibernation (in MooX::Role::DBIConnection). But overall, I try to keep little abstraction between me and DBI.
As long as you stay away from session-specific changes, having a pool of database handles is fair game and is the approach I would always try to use.
Re: Database access and async web framework (like Mojolicious)
by hippo (Archbishop) on Dec 12, 2024 at 12:03 UTC
Honestly, if an even vaguely modern Web Framework is not properly handling DB connections for you then that's a huge oversight. The advantage of using a framework is that it should take care of all that sort of low-level plumbing so you don't have to. If you are ending up rolling your own connection pooler/handler/sharer/queuer/whatever then maybe consider an alternative framework which already has what you need.
Re: Database access and async web framework (like Mojolicious)
by choroba (Cardinal) on Dec 12, 2024 at 11:56 UTC
Also note that SQLite locks the database file for certain operations. You might need to retry if several processes are talking to the database at the same time.
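One way to handle that, sketched with DBD::SQLite's real `sqlite_busy_timeout` knob plus an explicit retry loop (the `counters` table is hypothetical):

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=app.db', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Ask SQLite to wait up to 5 seconds for a lock instead of
# failing immediately with SQLITE_BUSY.
$dbh->sqlite_busy_timeout(5000);

# Or retry explicitly when another process holds the write lock:
my $tries = 0;
until ( eval { $dbh->do('UPDATE counters SET n = n + 1'); 1 } ) {
    die $@ if ++$tries > 5 or $@ !~ /database is locked/;
    sleep 1;
}
```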
Re: Database access and async web framework (like Mojolicious)
by Marshall (Canon) on Dec 12, 2024 at 13:22 UTC
I recommend that you study Using SQLite In Multi-Threaded Applications.
My concern is that because of the use of threads in such frameworks, keeping a DB handle over time will lead to problems. (let's say I keep the handle for 100 queries, then refresh it)
There is no problem keeping a handle for a very long time. There is no need to "refresh" it. A handle, and any statement prepared from that handle, can only be used in a single thread. Read my suggested link above.
Update: SQLite requires an exclusive lock when writing the DB. It takes care of this by itself, but be aware that it can have a huge impact on performance. Fixed bad link above.
Re: Database access and async web framework (like Mojolicious)
by karlgoethebier (Abbot) on Dec 14, 2024 at 15:39 UTC
It might be worthwhile to take a look at DBIx::Connector.
From ibidem:
”You can store the connection somewhere in your app where you can easily access it, and for as long as it remains in scope, it will try its hardest to maintain a database connection. Even across forks (especially with DBI 1.614 and higher) and new threads, and even calls to $conn->dbh->disconnect. When you don't need it anymore, let it go out of scope and the database connection will be closed.”
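A minimal usage sketch (the DSN and query are placeholders): `run()` hands you a live handle, reconnecting first if the process ID changed (i.e. after a fork) or the connection went away.

```perl
use strict;
use warnings;
use DBIx::Connector;

# Same arguments as DBI->connect, but the connection is managed.
my $conn = DBIx::Connector->new(
    'dbi:SQLite:dbname=app.db', '', '',
    { RaiseError => 1, AutoCommit => 1 },
);

# "fixup" mode pings/reconnects only when a query actually fails,
# which keeps the common case cheap.
my $rows = $conn->run(fixup => sub {
    my $dbh = shift;
    $dbh->selectall_arrayref('SELECT * FROM users');
});
```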
Re: Database access and async web framework (like Mojolicious)
by cavac (Prior) on Dec 13, 2024 at 12:19 UTC
I can only speak from experience with my own web framework. Long-running (background) processes keep their DB handle open (often for days on end). The web side uses two different strategies: occasional access is on-demand (XMLHttpRequest from JavaScript), while long-running stuff keeps a websocket open, which means the web backend also keeps a database handle open.
I use PostgreSQL as the database. I usually connect via UDS (Unix domain sockets), which is much faster and less resource-intensive than TCP. Give PostgreSQL a try: in my opinion it's much better at dealing with hundreds of connections than SQLite, and much more capable of dealing with large datasets and complicated queries.¹
¹ Your application might not have reached that point yet, but that stuff has a tendency to sneak up on you. It's always better to switch to a "big" database sooner rather than later. PostgreSQL is easy to install; if you are using any kind of Linux, there's most likely a package waiting for you in your package manager.
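For illustration, a UDS connection with DBD::Pg looks like this. The database name, user, and socket path are assumptions; check `unix_socket_directories` in postgresql.conf for the path on your system.

```perl
use strict;
use warnings;
use DBI;

# Pointing "host" at a directory makes DBD::Pg use the Unix
# domain socket in that directory instead of TCP. The path below
# is the Debian/Ubuntu default.
my $dbh = DBI->connect(
    'dbi:Pg:dbname=myapp;host=/var/run/postgresql',
    'myuser', '',    # peer auth over the socket often needs no password
    { RaiseError => 1, AutoCommit => 1 },
);
```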
Thanks, I am now giving Postgres a try. It's a rough ride. I'm still looking for how to log in as admin; nobody asked me to set an admin password. Once I get in, I have to deal with the initdb warning: enabling "trust" authentication for local connections. I am now trying to find that dreaded config file, and to disable the TCP ports, enabling only local socket access. OK, rant off. But I'm still interested to see how it goes after these initial hiccups.
Re: Database access and async web framework (like Mojolicious)
by NERDVANA (Priest) on Dec 19, 2024 at 16:16 UTC
Mojolicious does not use threads, but it does use event-driven callbacks. Maybe that's what you meant? It is true that you can't use regular DBI methods for the queries if you are expecting to respond to event-driven requests. However, I don't think SQLite can be event-driven anyway, because it is a library that runs in your own thread rather than acting like a server which you could asynchronously exchange events with.
That said, Mojo doesn't *need* to be used event-driven. You can run a pre-forked worker pool of Mojo app processes just like any of the other frameworks. In this mode, you just use DBI like normal. But, if you want to make use of Websockets, you do need the event-driven style and an event-driven server like Twiggy.
For Mojo to asynchronously talk to a database and respond to events when the queries complete, while processing other requests in the meantime, you need to use the custom database connectors like Mojo::Pg, and use the promise-returning methods like query_p.
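A minimal sketch of that style with Mojo::Pg (the DSN, route, and `users` table are made up for illustration):

```perl
#!/usr/bin/env perl
use Mojolicious::Lite -signatures;
use Mojo::Pg;

# Connect over the local socket as user "app" to database "myapp"
# (hypothetical names; adjust to your setup).
helper pg => sub { state $pg = Mojo::Pg->new('postgresql://app@/myapp') };

get '/count' => sub ($c) {
    # Disable automatic rendering; we will render from the callback.
    $c->render_later;

    # query_p returns a promise; the worker is free to serve other
    # requests until the database answers.
    $c->pg->db->query_p('SELECT count(*) AS n FROM users')->then(sub ($results) {
        $c->render(json => { users => $results->hash->{n} });
    })->catch(sub ($err) {
        $c->reply->exception($err);
    });
};

app->start;
```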
... My concern is that because of the use of threads in such frameworks, keeping a DB handle over time will lead to problems.
I think it is a pretty sensible thing to ask: what happens to the parent's DB handle on fork, and what is good practice in this case? Perhaps a DB handle should not be created in the parent (i.e. before a fork) but only in each child (after forking), and closed when the child exits. But that rules out a pool of DB handles, and also holding a DB handle persistently in the server itself rather than in each of its children.
I have updated my question to include this clarification, and also test code for a simple Mojo server ... which unfortunately refuses to fork!
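For what it's worth, DBI has a knob aimed at exactly this situation: with `AutoInactiveDestroy` set on the parent's handle, a child that inherits and then destroys that handle will not tear down the parent's connection. A sketch of the pattern (file name hypothetical; the same idea applies to any DBD):

```perl
use strict;
use warnings;
use DBI;

# Parent opens its handle with AutoInactiveDestroy: if a forked
# child later destroys its inherited copy, DBI notices the pid
# differs and skips the disconnect, leaving the parent intact.
my $dbh = DBI->connect('dbi:SQLite:dbname=app.db', '', '',
    { RaiseError => 1, AutoInactiveDestroy => 1 });

my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # Child: never reuse the inherited handle; open its own.
    my $child_dbh = DBI->connect('dbi:SQLite:dbname=app.db', '', '',
        { RaiseError => 1 });
    # ... child's work ...
    $child_dbh->disconnect;
    exit 0;
}
waitpid $pid, 0;    # parent keeps using $dbh safely
```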
By default, Mojo still doesn't fork :-) It is designed specifically to enable the style of programming where you have only one process that reacts to an incoming connection (event), does a little processing, and, if it needs a slow operation like a database query, starts that operation (linked to a callback) and then returns to look for more incoming connections. If the next event is a new request, it begins that request too, and may likewise leave it hanging on a long operation linked to a callback. When one of the operations completes (the database returns results), that becomes an event which gets picked up and executed some more, possibly finishing the request and sending it back to the client.
At any given moment there is still only one thread of one process doing the work, but it can interleave its ongoing tasks as they become ready. This is what I mean by "event-driven" programming, and enabling it was the primary reason Mojo was invented when other pretty-good frameworks already existed. Catalyst and Dancer can't do this (well) because they weren't designed for it.
You can also run a Mojo app behind a Plack server that forks, and in that case the server layer only ever passes one connection to Mojo at a time, defeating the event-driven features and making it just like a worker process of any other framework. But note that in this case the Plack server (Gazelle, for example) is the one forking, not Mojo.
You can also attempt to run e.g. Catalyst or Dancer behind an event-driven Plack server like Twiggy, but then you have to worry about all the popular plugins for those frameworks that make blocking calls to DNS, the DB, Redis, etc., which will wreck your event-driven throughput and hang your other requests. I also suspect that the database models for those frameworks would not take well to being used in event-driven style, and would very likely conflict over the same DB handle, but I've never tried.
So, back to your original question: it sounds like you were actually asking what happens when a pre-forking web app opens a database connection before it forks off its first worker, and the worker then tries to use that same handle. I don't actually know the answer, but I know it's not really a problem, because I've been using pre-forked apps for more than a decade and some of them definitely open a connection to the DB before they fork. Does it result in a wasted DB connection? Maybe. I've never been constrained by resources to the point where one additional connection would hurt. If your web app is so large that you run it on many front-end hosts (say, 50 different hosts, each with a master process and a pool of workers), then maybe you'd care about those 50 wasted connections.
Actually I think I looked into it once and found that DBD::mysql checks the process ID to see if it's the same as the one that opened the connection, and if not it makes a new connection. That's probably how they avoid two processes talking on the same socket. It's been a long time though and things may have changed. But they still work :-)
Also, MySQL connections are extremely cheap and plentiful, so no need to worry about those. Postgres connections are cheap but less plentiful, and supposedly large web apps (hundreds of workers) on Postgres need pgbouncer to avoid running out of connection slots. SQL Server connections are more expensive, but the library does a weird multiplexing thing on the same socket, so connections seem cheap after you've connected once. I've never used Oracle. And finally, SQLite, I believe, always runs in the same thread that made the request, so you can't do parallel event-driven stuff with it anyway.
Re: Database access and async web framework (like Mojolicious)
by stevieb (Canon) on Dec 13, 2024 at 09:40 UTC
My concern is that because of the use of threads in such frameworks, keeping a DB handle over time will lead to problems.
The only way to quell or prove the anecdotal theories you have is to test for yourself. Take the time to set up whatever DB configuration you want, then write an automated test suite to hammer the DB code until it either breaks or proves usable beyond what you consider acceptable.
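As a starting point, such a hammer test might look like this: fork a few writers against one SQLite file and check that no update is lost. The file name, table, and counts are arbitrary choices for the sketch.

```perl
use strict;
use warnings;
use DBI;
use Test::More;

my $dsn = 'dbi:SQLite:dbname=hammer.db';

# Set up a single counter row to fight over.
my $setup = DBI->connect($dsn, '', '', { RaiseError => 1 });
$setup->do('CREATE TABLE IF NOT EXISTS counters (n INTEGER)');
$setup->do('DELETE FROM counters');
$setup->do('INSERT INTO counters VALUES (0)');
$setup->disconnect;

my $kids = 4;
my $each = 100;
for (1 .. $kids) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    next if $pid;                       # parent keeps looping
    # Child: its own handle, generous busy timeout to ride out locks.
    my $dbh = DBI->connect($dsn, '', '', { RaiseError => 1 });
    $dbh->sqlite_busy_timeout(10_000);
    $dbh->do('UPDATE counters SET n = n + 1') for 1 .. $each;
    exit 0;
}
wait for 1 .. $kids;                    # reap all writers

my $check = DBI->connect($dsn, '', '', { RaiseError => 1 });
my ($n) = $check->selectrow_array('SELECT n FROM counters');
is $n, $kids * $each, 'no lost updates under concurrent writers';
done_testing;
```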
FWIW, I prefer Dancer2, but I digress.
Can you say more about why you prefer Dancer?