pileofrogs has asked for the wisdom of the Perl Monks concerning the following question:

What's the best way to make a script abort if an instance of it is already running? I'm on Linux.

I've done this with process id (pid) files and searching the process list from ps, but I don't know what the various pros and cons are. This seems like something that might have an established best-practice, so if you know it, I'd love to hear it.

Just in case I'm being unclear, I'll give an example. I have a process that runs every 10 minutes. It usually only takes 2 minutes to complete, but sometimes it takes 15 or 20 minutes. I don't want the new instance to step on the toes of the long-running earlier instance.

I've solved this before by the old time-honored tradition of making-something-up-as-I-go-along and I'd like to hear what other people think is the best way to handle this.

Thanks!
--Pileofrogs

Replies are listed 'Best First'.
Re: Abort if instance already running?
by Corion (Patriarch) on Jan 05, 2009 at 23:23 UTC
Re: Abort if instance already running?
by shmem (Chancellor) on Jan 05, 2009 at 23:20 UTC

    Well, the traditional way is to look for "/var/run/".basename($0).".pid". If it exists, read its contents (a PID). Look up that PID in the process table (or test -d "/proc/$pid" and check /proc/$pid/cmdline to be sure); if it is there, exit gracefully. Otherwise overwrite "/var/run/".basename($0).".pid" with your own $$.
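    The pid-file check described above might be sketched as follows. This is only an illustration under stated assumptions: the helper names running_pid and claim_pidfile and the path in the usage comment are made up for this example, and kill 0 stands in for the process-table or /proc lookup mentioned above. The check and the write are separate steps, so two instances starting at the same moment can still race.

```perl
use strict;
use warnings;

# Return the PID recorded in $pidfile if that process still appears to be
# alive, or 0 if the file is missing, unreadable, or stale.
sub running_pid {
    my ($pidfile) = @_;
    open my $in, '<', $pidfile or return 0;
    my $line = <$in>;
    close $in;
    return 0 unless defined $line && $line =~ /^(\d+)/;
    my $pid = $1;
    # kill 0 delivers no signal; it only asks whether the PID exists.
    # EPERM means the process exists but belongs to another user.
    return $pid if kill(0, $pid) || $!{EPERM};
    return 0;
}

# Record our own PID in $pidfile, overwriting a stale file.
# Returns 1 on success, or 0 if a live process already owns the file.
sub claim_pidfile {
    my ($pidfile) = @_;
    return 0 if running_pid($pidfile);
    open my $out, '>', $pidfile or die "cannot write $pidfile: $!";
    print {$out} "$$\n";
    close $out or die "cannot close $pidfile: $!";
    return 1;
}

# Hypothetical usage at the top of the script:
#   claim_pidfile("/tmp/frogs.pid") or exit 0;
```

    Unlike the /proc/$pid/cmdline check, kill 0 cannot tell whether the PID now belongs to an unrelated program, so a recycled PID would look like a running instance.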

    There's a solution with flock(2) too... merlyn has written a column about that, IIRC (update: what Corion says :-) ). Ah, wait... the best way? Whatever works best on your system ;)

Re: Abort if instance already running?
by Perlbotics (Archbishop) on Jan 05, 2009 at 23:23 UTC

    Don't know if using a flag(better: lock)-file is the best way to do it, but it should work for your purpose. Something along...

        BEGIN {
            our $LOCKFILE = "/tmp/frogs.pid";
            die "Another process is already running (see content of $LOCKFILE for pid)...\n"
                if -e $LOCKFILE;
            open my $out, '>', $LOCKFILE or die "cannot open $LOCKFILE - $!";
            print $out "$$\n";
            close $out;
            # chmod 0600 ... if you like
        }

        warn "running... with PID: $$ ...\n";
        sleep 10;

        END {
            unlink $LOCKFILE or die "cannot remove lock-file $LOCKFILE - $!";
        }

    It is not race-condition proof, but 10-20 minutes in between runs is not a race condition, I guess.
    Update: Ok, when using cronjobs, there is a probability not equal to zero that a cronjob may collide one day... interacting with a manual invocation.
    Update2: Used /tmp instead of /var/run since you might not have the privileges (root) to access files there. If you do, /var/run is closer to best practice. You might also need to register SIGnal handlers to remove the lock file.

Re: Abort if instance already running?
by kyle (Abbot) on Jan 06, 2009 at 04:27 UTC
Re: Abort if instance already running?
by andreas1234567 (Vicar) on Jan 06, 2009 at 08:50 UTC
    I have used Proc::PID::File and find it very easy to use:
        use Proc::PID::File;
        die "Already running!" if Proc::PID::File->running();
    --
    No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them.
Re: Abort if instance already running?
by JavaFan (Canon) on Jan 05, 2009 at 23:42 UTC
    In such cases, I typically use a lock file, usually /var/run/name-of-program. When starting the program, create the file if it doesn't exist (using sysopen), then try to get a lock (nonblocking). If the process can get the lock, it knows it's the only instance running. If it cannot get the lock, another instance is running, and it exits.

    It's a technique simple enough that I can't be bothered to look for a CPAN solution. And it can easily be implemented in a different language as well.
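    The steps above (sysopen, then a nonblocking lock attempt) could look roughly like this. It is a minimal sketch, not the poster's actual code: the function name try_lock and the path in the usage comment are assumptions made for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);   # O_* flags for sysopen, LOCK_* for flock

# Try to take an exclusive, non-blocking lock on $path, creating the file
# if it does not exist. Returns the open handle on success, or nothing
# (undef in scalar context) if another process already holds the lock.
sub try_lock {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT, 0644)
        or die "cannot open $path: $!";
    return $fh if flock($fh, LOCK_EX | LOCK_NB);
    return;    # $fh goes out of scope here and is closed
}

# Hypothetical usage at the top of the script. Keep the handle in a
# variable that lives as long as the program; closing it drops the lock.
#   our $LOCK = try_lock("/tmp/name-of-program.lock")
#       or die "Another instance is already running\n";
```

    The file itself is never removed, so there is no stale-file cleanup to get wrong: only the lock matters, and the lock goes away with the handle.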

      Excellent approach and very standard. I would also think that the poster would want to install signal handlers to release the appropriate locks if the program dies, gets a Ctrl-C or whatever. Something like this:

          $SIG{'INT'}  = \&release_lock;
          $SIG{'QUIT'} = \&release_lock;
          $SIG{'PIPE'} = \&release_lock;   # maybe too?

          sub release_lock {
              # ...do what you can here...
          }

      The basic idea is not to leave locks around if your program abends. The lock strategy is simple, works well, and can be implemented in many languages on many OSes (some more easily than others). Just remember to clean up if "bad things" happen to your program and the normal "release lock" code doesn't execute.

      The lock system is cooperative, meaning that everybody has to know what is going on and cooperate. It is still possible to write to a file that is "locked" for exclusive access: a file lock is advisory, and the programs that use it have to "play by the rules".

        I think the various "nit-picky details" of using file locking like this may be a good reason to stick with the pid-file idea explained in the initial reply -- assuming of course that the goal is to make sure only one instance of a program is running on the same host where you would be trying to start a new instance.

        If for some reason there are different hosts that share a given disk and might try to start instances of this process, but only one host at a time may run the one instance of the process, then you'd have to complicate the pid-file approach somewhat, and file locking might be easier/better overall (assuming it works reasonably well on your shared file system... YMMV).

        just remember to clean up if "bad things" happen to your program and the normal "release lock" code doesn't execute.
        Needlessly complicated! I have never worked on an OS where a process that no longer exists continues to hold a lock on a file. On any Unix system I know of, if a program exits (be it normally, from a signal or from an error), all of its locks are released.

        You may need an explicit release of the lock if your program forks or execs, and you don't want the children or the new program to hold the lock.
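        For the fork case, one possible shape is sketched below. It assumes Perl's flock maps to flock(2), as it does on Linux, where the lock belongs to the open file that parent and child share. Under that assumption the child should close its copy of the handle rather than call LOCK_UN, since unlocking would release the lock for the parent as well. The name spawn_without_lock is invented for this example.

```perl
use strict;
use warnings;

# Fork a child that drops its inherited copy of the lock handle before
# running $work. Returns the child's PID in the parent, which keeps
# holding the lock through its own copy of the handle.
sub spawn_without_lock {
    my ($lock_fh, $work) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;    # parent: lock untouched

    close $lock_fh;         # child: close, do not flock(..., LOCK_UN)
    $work->();
    exit 0;
}
```

        Without the close, a long-lived child would keep the lock held even after the parent has exited, and the next scheduled run would refuse to start.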

Re: Abort if instance already running?
by lakshmananindia (Chaplain) on Jan 06, 2009 at 04:21 UTC

    I use locks in this case:

        use Fcntl qw(:flock);   # provides LOCK_EX and LOCK_NB

        # Open a file
        open(SINGLECHECK, ">.server1.pid")
            or $file->log("$0:Cannot open the lock file $!\n");

        # Lock the file in exclusive and non-blocking mode so that no two
        # instances of the same daemon will run.
        flock(SINGLECHECK, LOCK_EX|LOCK_NB)
            or ($file->log("$0:server is Already running\n") and exit(1));

      I'm a bit baffled by the last line. It suggests that either the log method always returns a true value, or that if the logging somehow fails, the program doesn't exit.
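      One way to make the exit independent of what the logger returns is to use a block instead of chaining with "and". The sketch below is an assumption-laden illustration, not the poster's code: lock_or_exit is a made-up name, the logger is passed in as a plain code reference, and the file is opened in append mode so that a failed attempt does not truncate the running instance's file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Open $path and take an exclusive, non-blocking lock on it.
# On any failure, log through the $log callback and exit with status 1,
# regardless of what the callback returns.
sub lock_or_exit {
    my ($path, $log) = @_;
    my $fh;
    unless (open($fh, '>>', $path)) {
        $log->("$0: cannot open the lock file $path: $!");
        exit 1;
    }
    unless (flock($fh, LOCK_EX | LOCK_NB)) {
        $log->("$0: server is already running");
        exit 1;
    }
    return $fh;    # caller must keep this handle open to keep the lock
}
```

      With this shape, a logger that returns false (or one that fails quietly) can no longer let a second instance carry on past the lock check.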