http://qs1969.pair.com?node_id=529600

clinton has asked for the wisdom of the Perl Monks concerning the following question:

Hi all

I need to implement a cron daemon which runs jobs such as sending emails, clearing caches, updating search indexes, etc.

The whole daemony forky thing is foreign to me. I've spent some time looking at Proc::Daemon, Schedule::Cron, perlipc, etc., but I'm a bit confused as to where to begin and what is possible/easy.

Essentially, what I want to achieve is:

Dispatcher daemon:
- allows modules to register with it
- when a job is scheduled, calls the job and forks
  - the job tries to obtain a lock
    - if it fails, count++; if count > max, warn
    - if the lock is obtained:
      - run the job to completion
      - append messages to a log shared by all jobs
      - unlock
      - exit
- if it receives a SIGINT:
  - stops all forked children
  - exits
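A minimal sketch of the SIGINT part, assuming the parent tracks child PIDs in a hash (%children and the choice of TERM are assumptions, not part of the node):

    use POSIX ':sys_wait_h';    # for WNOHANG

    my %children;               # pid => job name, filled in as jobs are forked

    $SIG{INT} = sub {
        kill 'TERM', $_ for keys %children;   # stop all forked children
        waitpid $_, 0  for keys %children;    # reap them before exiting
        exit 0;
    };

    $SIG{CHLD} = sub {
        # reap jobs that finish on their own
        while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
            delete $children{$pid};
        }
    };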
The forked process should not share data with the parent process, and each time a job starts, it should start with a fresh slate. I assume this is an 'of course' with fork, but what if there were just SOME data that I would like to load only once and share with the forked processes - is this possible?

The details I can figure out, but some pointers would be appreciated.

many thanks

Re: A perl daemon
by sh1tn (Priest) on Feb 11, 2006 at 22:25 UTC
    Or just perldoc -q daemon.
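    That FAQ entry points at the "Complete Dissociation of Child from Parent" recipe in perlipc, which is roughly this sketch:

        use POSIX qw(setsid);

        sub daemonize {
            chdir '/'                     or die "Can't chdir to /: $!";
            open STDIN,  '<', '/dev/null' or die "Can't read /dev/null: $!";
            open STDOUT, '>', '/dev/null' or die "Can't write /dev/null: $!";
            defined(my $pid = fork)       or die "Can't fork: $!";
            exit if $pid;                 # parent exits; child becomes the daemon
            setsid() != -1                or die "Can't start a new session: $!";
            open STDERR, '>&', STDOUT     or die "Can't dup stdout: $!";
        }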


Re: A perl daemon
by Perl Mouse (Chaplain) on Feb 11, 2006 at 21:50 UTC
    On a fork, the process is duplicated. The two copies will have (almost) the same state (things like the PID will differ). Both copies will have the same data - but what you change in one copy won't change in the other.
    Perl --((8:>*
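    A quick sketch illustrating that last point:

        my $counter = 0;

        defined(my $pid = fork) or die "fork failed: $!";
        if ($pid == 0) {        # child: has its own copy of $counter
            $counter++;
            exit 0;
        }
        waitpid $pid, 0;        # parent waits for the child
        print "$counter\n";     # prints 0 - the child's change stayed in the child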
Re: A perl daemon
by clinton (Priest) on Jun 08, 2006 at 10:55 UTC
    This node gets a fair bit of traffic, so I thought I'd post the code that I used to implement the above. The reason I record the number of attempts is that some jobs that run frequently may still be running when the next run tries to start. If a job is STILL running after 5 attempts, something is wrong and you should probably kill the process.

    The only difference in the logic is that the check for whether the job is already running happens after the fork, instead of before, as I was getting locking issues with database handles being passed from parent to child.

    It also emails you if any errors occur, or a job fails to run.

    I'd welcome comments on the code. It could be CPANed quite easily - just need to add hooks for the database calls.
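    A minimal sketch of that fork-then-lock pattern - fork first, take the lock in the child - assuming flock on a per-job lockfile (launch_job and the lockfile path are invented for illustration):

        use Fcntl qw(:flock);

        sub launch_job {
            my ($name, $job) = @_;
            defined(my $pid = fork) or die "Can't fork for $name: $!";
            return $pid if $pid;    # parent: record the pid, keep dispatching

            # child: the "already running?" check happens *after* the fork,
            # so no database handles cross the fork boundary
            open my $lock, '>', "/tmp/cron-$name.lock"
                or die "Can't open lockfile for $name: $!";
            unless (flock $lock, LOCK_EX | LOCK_NB) {
                warn "$name is still running from a previous launch\n";
                exit 1;             # parent can count this as a failed attempt
            }
            eval { $job->() };
            warn "$name died: $@" if $@;
            exit 0;
        }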

      I found this code potentially very handy. However, the dispatcher function within the class is a new process every time it executes. So if I update the cron, the changes are lost once the function completes.

      If I create something similar as a script (not as a daemon), the schedule doesn't fork off a new process each time the function is called.

      Is there a way to mimic this functionality (non-forking) by using a daemon similar to the above example?

        However, the dispatcher function within the class is a new process every time it executes. So if I update the cron, the changes are lost once the function completes.

        You mean, if you were to update the cron table from one of the jobs? Yes, those changes would be lost.

        I've been using the example code above in production for two years now, and it runs beautifully. I have a large application which gets loaded at startup, and using fork to launch each process on Linux is really cheap (as it makes use of copy-on-write). It also means that different jobs can run simultaneously.

        In Schedule::Cron there is an option nofork, which (not surprisingly) launches the jobs without forking :) - this would let you alter your cron table, but would only run jobs sequentially.
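        For reference, a minimal Schedule::Cron setup along those lines (reindex is a placeholder job; nofork is passed to run() as the module's docs describe):

            use Schedule::Cron;

            sub reindex { warn "reindexing...\n" }    # placeholder job

            # new() takes a default dispatcher for entries without a coderef
            my $cron = Schedule::Cron->new(sub { warn "no job given\n" });

            $cron->add_entry('*/5 * * * *', \&reindex);   # every 5 minutes

            # nofork => 1: jobs run sequentially in the daemon's own process,
            # so any changes a job makes to the cron table persist
            $cron->run(nofork => 1);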

        Instead of that, you could consider making your daemon re-read the cron table whenever it receives a signal of your choosing, e.g. $SIG{USR1}; then your child job could update the cron table and signal its parent.
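        A sketch of that approach, where load_cron_table() and run_due_jobs() are hypothetical stand-ins for the daemon's real routines:

            my $reload = 0;
            $SIG{USR1} = sub { $reload = 1 };   # just set a flag; keep the handler tiny

            while (1) {
                if ($reload) {
                    $reload = 0;
                    load_cron_table();          # hypothetical: rebuild the schedule
                }
                run_due_jobs();                 # hypothetical: dispatch anything due
                sleep 1;
            }

            # ...and in a child job, after rewriting the cron table:
            kill 'USR1', getppid();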