in reply to Daemonizing (or otherwise speeding up) a high-overhead script?

There is PPerl, which basically implements a prefork server for arbitrary scripts. The idea is that it launches, does the costly initialization, and then forks into the background. Then, if you ever launch the script again, it will connect to the background server and skip the costly initialization.
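For completeness, using it is mostly a matter of swapping the interpreter line; a minimal sketch (assuming PPerl is installed as /usr/bin/pperl, with Some::Heavy::Module standing in for whatever makes your startup expensive):

    #!/usr/bin/pperl
    # First invocation compiles the script, runs the costly startup and stays
    # resident as a daemon; later invocations are served by that process.
    use strict;
    use warnings;
    use Some::Heavy::Module;   # placeholder for the slow-to-load parts

    print "request handled by PID $$\n";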

The problem is that fork and database handles (like all external resources) don't play well together, so you have to reinitialize them after each fork.

The other replies have already recommended frameworks like Mojolicious to implement a small server, and I think this is a sound approach. Personally, I would look at using some kind of job queue, be it directory/file based or database based. Minion, for example, is such a job queue, and it also comes with a monitoring UI etc.

This means splitting your code into a script/frontend that submits jobs, and a "worker" component that does the costly initialization once and then processes the submitted jobs. The workers pick jobs from the queue, and depending on machine load etc. you can launch more workers or kill some off.
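As a rough sketch of that split, using Minion here only because it was already mentioned (backend, task name and arguments are made up for illustration; but see the update below):

    # myapp.pl - declares the queue backend and the worker-side task
    use Mojolicious::Lite -signatures;

    plugin Minion => {SQLite => 'sqlite:minion.db'};   # needs Minion::Backend::SQLite

    app->minion->add_task(crunch => sub ($job, @args) {
        # per-job work goes here
        $job->finish("processed @args");
    });

    app->start;

The frontend then only enqueues jobs, e.g. ./myapp.pl minion job -e crunch -a '["foo"]' or app->minion->enqueue(crunch => ['foo']), while ./myapp.pl minion worker runs the long-lived worker part.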

Update: I have to retract my recommendation of Minion for this situation, because it forks a new instance for every job. Forking a new instance for each job means connecting to the database for every single job. In a quick scan I didn't see a way to have one worker process handle multiple jobs before it exits.


Re^2: Daemonizing (or otherwise speeding up) a high-overhead script?
by cavac (Prior) on Aug 24, 2023 at 10:27 UTC

    I agree, splitting background tasks into dedicated, small workers with proper job queueing is certainly the way to go.

    In my systems, I have various "tasks to do" tables the workers work on. The workers run all the time, just waiting for new jobs to be scheduled. I also do this for time-based scheduling. It's often better to run the "do something every 5 minutes" stuff internally in the worker instead of calling it from a cron job. And in many (if not most) cases it's really "once per hour" rather than "at the start of every hour". That way, you can spread out the server load a bit better.
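    A bare-bones version of such a polling worker might look like this (a DBI-based sketch; the table name, columns and connection details are placeholders, and with several workers on one table you'd want something like SELECT ... FOR UPDATE SKIP LOCKED):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use DBI;

        # connect once at startup - this is the expensive part we only pay once
        my $dbh = DBI->connect('dbi:Pg:dbname=myapp', 'worker', 'secret',
            { AutoCommit => 1, RaiseError => 1 });

        my $last_hourly = 0;

        while (1) {
            # grab the oldest unprocessed task, if any
            my $task = $dbh->selectrow_hashref(
                q{SELECT id, payload FROM tasks_todo
                  WHERE status = 'new' ORDER BY id LIMIT 1});

            if ($task) {
                $dbh->do(q{UPDATE tasks_todo SET status = 'running' WHERE id = ?},
                    undef, $task->{id});
                # ... do the actual work with $task->{payload} here ...
                $dbh->do(q{UPDATE tasks_todo SET status = 'done' WHERE id = ?},
                    undef, $task->{id});
                next;                  # look for the next task right away
            }

            # "roughly once per hour" housekeeping, independent of cron
            if (time() - $last_hourly > 3600) {
                # ... periodic maintenance goes here ...
                $last_hourly = time();
            }

            sleep 5;                   # idle - poll again in a few seconds
        }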

    Whenever I need a worker to react in a somewhat realtime manner (for example, processing and printing an invoice after the user has finished input), I add an IPC (interprocess communication) "trigger" to start checking the database (or just doing whatever needs to be done) NOW.

    Shameless plug: In my projects I use Net::Clacks for IPC, see also this slightly outdated example: Interprocess messaging with Net::Clacks
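    Purely to illustrate the "trigger" idea (a minimal stand-in, not how Net::Clacks itself works), even a plain UNIX signal is enough to wake the polling loop above:

        # worker side: let a signal cut the idle sleep short
        my $poked = 0;
        $SIG{USR1} = sub { $poked = 1 };

        while (1) {
            # ... check the "tasks to do" table as above ...
            sleep 5 unless $poked;   # a delivered USR1 also interrupts sleep()
            $poked = 0;
        }

    The frontend then does kill 'USR1', $worker_pid right after inserting the job (with the worker's PID read from a pidfile or similar).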
