Baz has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I have scheduled a crontab entry to run a Perl script every hour. This script pops a surname from the top line of a file and then computes stats for that name. The thing is, though, some names take much longer to process than others; it varies from a few minutes to 2 hours.
What would you guys recommend I do? Is there any Perl functionality I could use to manage this scenario, and ensure that each run has the resources it needs before it executes?
The script is also communicating with another server, and for this reason I don't want to run the jobs as a batch; I want an interval between each call.
thanks.

Re: Cron Job Management
by no_slogan (Deacon) on Jun 10, 2002 at 19:08 UTC
    You could have the crontabbed script flock a lockfile with LOCK_EX|LOCK_NB. If the lock fails, the file is already locked, so just exit and let the previous invocation of the script finish its work.
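
    For example, a minimal sketch of that lockfile idea (the lockfile path and the "work" section are just placeholders):

    use Fcntl qw(:flock);

    open my $lock, '>', '/tmp/namestats.lock' or die "can't open lockfile: $!";
    unless (flock $lock, LOCK_EX | LOCK_NB) {
        # a previous run still holds the lock, so just bow out
        exit 0;
    }
    # ... pop the next surname and compute its stats ...
    # the lock is released automatically when the script exits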

    If you need smarter scheduling behavior (like dealing with different-length jobs intelligently, or checking for available resources), you'll probably have to use something other than cron. A simple solution would be to have the script run forever, with a big sleep between jobs.
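
    A bare-bones sketch of that run-forever approach; process_next_name() and the ten-minute pause are placeholders for your own code and interval:

    # one long-lived process instead of an hourly cron entry
    while (1) {
        process_next_name();   # pop a surname and compute its stats
        sleep 600;             # pause before hitting the server again
    }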

    At my previous place of employment, we had a saying that any script will eventually grow into a scheduling engine. That's a much more complicated solution. Basically, you maintain a list of jobs that can be run, and start one of them whenever the time is right. You can move jobs in and out of this "run queue" as the required resources come and go. You can set various rules for when you start a new job, like how much time after the previous job, or how many requests per minute to the server, or whatever. In some cases, you might want to run multiple jobs at once by having the main process fork off a child to handle each job, then wait for the children to exit. The number of children running at once can be adjusted based on system load, jobs that run for too long can be killed, and so on.

    If we'd had a clue, maybe we would have written a generic scheduling engine instead of just building a new one from scratch every time. And maybe we'd still be in business.
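
    As a very rough illustration of that fork-per-job idea (everything here, from @jobs to run_job(), is invented for the example):

    use POSIX ":sys_wait_h";

    my $max_children = 3;
    my $running      = 0;

    foreach my $job (@jobs) {
        # reap any children that have already finished
        $running-- while waitpid(-1, WNOHANG) > 0;

        # if we're at the limit, block until one child exits
        if ($running >= $max_children) {
            waitpid(-1, 0);
            $running--;
        }

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            run_job($job);   # child does the actual work
            exit 0;
        }
        $running++;
    }
    1 while waitpid(-1, 0) > 0;   # wait for the stragglers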

    Update: Try CPAN, maybe you'll find something that does what you want.

      Dunno about CPAN, but merlyn wrote Highlander, which does pretty much the same thing as you describe.

          --k.


Re: Cron Job Management
by bnh (Novice) on Jun 10, 2002 at 21:28 UTC
    You could look at Schedule::Parallel or Parallel::ForkManager.
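
    For instance, Parallel::ForkManager caps how many jobs run at once; this sketch assumes a @surnames list and a compute_stats() routine of your own:

    use Parallel::ForkManager;

    my $pm = Parallel::ForkManager->new(3);   # at most three jobs at a time

    foreach my $surname (@surnames) {
        $pm->start and next;       # parent: move on to the next name
        compute_stats($surname);   # child: do the heavy lifting
        $pm->finish;               # child exits here
    }
    $pm->wait_all_children;
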
Re: Cron Job Management
by gav^ (Curate) on Jun 10, 2002 at 21:31 UTC
    Using Proc::PID_File is a pretty straightforward way of making sure your process is only running once:
    use Proc::PID_File;
    exit if hold_pid_file("/tmp/myscripts_name_here.pid");

    gav^