http://qs1969.pair.com?node_id=239918

Nitrox has asked for the wisdom of the Perl Monks concerning the following question:

In a current project, I need to ensure that only a single copy of a script runs at a time. I'd like to hear comments on the subroutine below, which gets called at the beginning of the script. I'm hoping to 'solidify' it enough for reuse in my workplace (across our Unix and Win32 environments).

The latest tweak I've been contemplating is whether calling flock() on PID_FILE would gain me anything. Perhaps preventing a race condition if 2 processes were launched simultaneously?

sub single_instance {
    $SIG{INT} = sub { exit() };
    my ($prog) = $0 =~ m|(?:.*[/\\])?(.*)$|;
    my $tmp_dir = $^O =~ "MSWin32" ? "C:/Windows/Temp" : "/tmp";
    my $lockfile = "$tmp_dir/$prog.pid";
    local *PID_FILE;

    if (-e $lockfile) {
        open(PID_FILE, "<$lockfile")
            or die("single_instance(): Can't open $lockfile: $!\n");
        my $running_pid = <PID_FILE>;
        close(PID_FILE);
        chomp($running_pid);
        if (kill 0, $running_pid) {
            die("Already running as pid $running_pid\n");
        }
    }

    open(PID_FILE, "+>$lockfile")
        or die("single_instance(): Can't open $lockfile: $!\n");
    print PID_FILE "$$\n";
    close(PID_FILE);
    chmod(0666, $lockfile);
    eval "END { unlink '$lockfile' }";
    die("single_instance(): $@\n") if ($@);
}

-Nitrox

Re: Determine if script is already running
by Abigail-II (Bishop) on Mar 03, 2003 at 00:59 UTC
    Some obvious problems: there's a race condition. If two instances start at the same time, they might both determine the lockfile isn't there (the -e failing), and both create the file and write the PID in it.

    Furthermore, if the program never gets the chance to run the END block (kill -9, exec), the next instance may find a stale lockfile whose recorded PID now happens to belong to an unrelated running process.

    I'd say, open the file for read/write/create, and then try to gain an exclusive lock (non-blocking). If you get the lock, you're the only instance running. Keep the lock until the program terminates. Regardless how the process terminates, the kernel will release the lock (but make sure you pick a local file, not one mounted by NFS).
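    A minimal sketch of that approach (the sub name and lock path are illustrative, not from the thread) might look like:

    ```perl
    use Fcntl qw(:flock);

    # Returns an open, locked handle on success, or undef if another
    # instance already holds the lock.  The caller must keep the handle
    # in scope for the life of the program: the kernel releases the
    # flock when the process exits, however it dies.
    sub acquire_single_instance_lock {
        my ($path) = @_;
        open(my $fh, '+>>', $path)      # read/write, create, no truncate
            or die "Can't open $path: $!\n";
        flock($fh, LOCK_EX | LOCK_NB)   # non-blocking exclusive lock
            or return undef;            # another instance holds it
        # Record our PID for human inspection; the lock is what matters.
        seek($fh, 0, 0);
        truncate($fh, 0);
        print $fh "$$\n";
        return $fh;
    }

    my $lock = acquire_single_instance_lock("/tmp/single_instance_demo.pid")
        or die "Already running\n";
    ```

    Because the lock lives in the kernel, a kill -9 or an exec releases it automatically; no END block or stale-PID check is needed.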

    Abigail

      Abigail, I tried to code for the possibility of a kill -9, (and Ctrl-C for that matter) and looked right past the simplicity of using a persistent lock for the scripts run duration. Thanks for clearing the fog. :)

      -Nitrox

Re: Determine if script is already running
by Limbic~Region (Chancellor) on Mar 03, 2003 at 00:59 UTC
    Nitrox,
    You mentioned that you are trying to make this work cross-platform, which creates a problem. In Unix, it is possible to delete a file that is locked. I would suggest you use $$ and Proc::ProcessTable. It says Unix, but it does have a port to Cygwin.
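    A rough sketch of that idea (the sub name and matching heuristic are mine, and the available fields vary by platform) might be:

    ```perl
    use Proc::ProcessTable;

    # Scan the process table for another process whose command line
    # mentions this script.  'cmndline' is populated on Linux but may
    # be empty on some platforms, so treat this as one check among
    # several rather than a complete solution.
    sub another_instance_running {
        my ($script_name) = @_;
        for my $proc (@{ Proc::ProcessTable->new->table }) {
            next if $proc->pid == $$;               # skip ourselves
            my $cmd = $proc->cmndline // '';
            return 1 if index($cmd, $script_name) >= 0;
        }
        return 0;
    }

    die "Already running\n" if another_instance_running($0);
    ```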

    Cheers - L~R

Re: Determine if script is already running
by caedes (Pilgrim) on Mar 03, 2003 at 01:04 UTC
    Yes, it does appear that you have a race condition in that code. Since what you're doing is very similar to a read/write operation, I'd suggest using a second lock file (called a semaphore file) to lock the whole operation between testing for the existence of the lock file and finally writing $$ to it. E.g.
    sub single_instance {
        $SIG{INT} = sub { exit() };
        my ($prog) = $0 =~ m|(?:.*[/\\])?(.*)$|;
        my $tmp_dir = $^O =~ "MSWin32" ? "C:/Windows/Temp" : "/tmp";
        my $lockfile = "$tmp_dir/$prog.pid";
        local *PID_FILE;

        open(SLOCK, ">$tmp_dir/$prog.slock")
            or die("single_instance(): Can't open slock: $!\n");
        flock(SLOCK, 2);    # 2 == LOCK_EX

        if (-e $lockfile) {
            open(PID_FILE, "<$lockfile")
                or die("single_instance(): Can't open $lockfile: $!\n");
            my $running_pid = <PID_FILE>;
            close(PID_FILE);
            chomp($running_pid);
            if (kill 0, $running_pid) {
                die("Already running as pid $running_pid\n");
            }
        }

        open(PID_FILE, "+>$lockfile")
            or die("single_instance(): Can't open $lockfile: $!\n");
        print PID_FILE "$$\n";
        close(PID_FILE);
        close(SLOCK);
        chmod(0666, $lockfile);
        eval "END { unlink '$lockfile' }";
        die("single_instance(): $@\n") if ($@);
    }

    This avoids the race condition because whichever process first gets an exclusive lock on the semaphore file can read and write the lock file before the other process can even test for the lock file's existence.

    -caedes

Re: Determine if script is already running
by BrowserUk (Patriarch) on Mar 03, 2003 at 05:27 UTC

    This is more a question than an answer.

    A few years ago I was developing a system, some parts of which ran on HPUX 10.20 servers, and we needed a single copy of a small daemon process that monitored other processes in the system and restarted them if they hung or died. Some of the guys were playing around with flag files (actually, they were using directories) when one of the local hpux experts wandered by for another reason and suggested Sys V semaphores.

    This question triggered the memory and I did a scan of CPAN and found IPC::Semaphore.

    Is there some reason why you couldn't combine that with Win32::Semaphore or Win32::Mutex to achieve this?
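    For the Unix side, a sketch of the SysV-semaphore idea (the key derivation and names are mine) could look like the following. A single atomic semop both checks that the semaphore is zero and raises it to one, and SEM_UNDO makes the kernel lower it again when the process exits, even on kill -9. On Win32, Win32::Mutex would presumably play the analogous role.

    ```perl
    use IPC::SysV qw(IPC_CREAT IPC_NOWAIT SEM_UNDO ftok);
    use IPC::Semaphore;

    # Derive a key from the script path (the ftok project id '1' is arbitrary).
    my $key = ftok($0, 1) or die "ftok: $!\n";
    my $sem = IPC::Semaphore->new($key, 1, 0666 | IPC_CREAT)
        or die "Can't create semaphore: $!\n";

    # One atomic semop: fail unless the value is 0, then add 1.
    # SEM_UNDO tells the kernel to subtract the 1 again at process exit.
    $sem->op(0, 0, IPC_NOWAIT,      # wait-for-zero, don't block
             0, 1, SEM_UNDO)        # increment, auto-undone on exit
        or die "Already running\n";
    ```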


    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: Determine if script is already running
by guha (Priest) on Mar 03, 2003 at 09:29 UTC

    As usual, there is something to be learned from our mighty merlyn!

    Check out this Web Techniques Column, which IMHO is closely related to what you're trying to achieve.

      Abigail also suggested a persistent lock in her post and it's the direction I'm headed. Which brings me to another question:

      Are there any pitfalls to holding a lock for an extended period of time? (we're talking weeks here)

      -Nitrox

        Only one comes to mind: IMNSHO, the only OSes I've seen that don't do better with weekly restarts are VMS and MVS (and maybe OS/390; I haven't worked with it much).

        Holding a lock for a week at a time and doing a controlled restart should be entirely safe under Windows NT 4.0 or later, any modern Linux, SCO, etc.

        Rather than re-inventing the wheel, are there parts of Big Brother or Nagios that could save you some coding time?

        Abigail: Thanks for the good idea, I've got a problem similar to Nitrox's in my current project and I like your solution an awful lot better than what I'm doing now. ++.

        --
        Spring: Forces, Coiled Again!
        Nitrox,
        As I indicated in my first post. Using a lock file by itself, regardless of type, will not guarantee only a single copy of a script is running in Unix. This is because it is possible to delete a file that is locked. Using the /tmp directory most likely increases the odds of deletion by its nature. The subsequent instance of the script is able to create and lock the new file - and now you have two copies running. You really need to have multiple methods for validation and checking the process table is a good place to start.

        Cheers - L~R

Re: Determine if script is already running
by graff (Chancellor) on Mar 03, 2003 at 01:01 UTC
    It would be helpful to know why you need to assure that only one instance of the script is running at any given time. The question would boil down to knowing what resource(s) (or what data) require the constraint: maybe it would be easier/safer to create a lock for the resource/data rather than for the process that uses it. For that matter, if it's an issue of ruling out concurrent access to a given chunk of data, maybe a DBMS (mysql or some such) could handle this job for you.

    Also, since you talk about this being employed over multiple hosts in the workplace, you should clarify whether or not the "one copy at a time" applies globally to all hosts. That is, will it be okay to have two copies running at once, so long as those two are running on separate hosts? If not, then obviously your current approach won't handle the problem of locking out other hosts that might be competing for the resource/data. (update: your approach could be extended to handle this case by adding hostname to the pid file, and doing ssh or whatever when appropriate to check for the pid on some other host)

    Apart from those concerns, and assuming that "one copy per host" is your intention, your present code seems good (if it works as-is on Windows, which I don't know), and adding flock on the pid file probably won't improve on it much (er... that last claim should be amended according to the replies posted above).

      The script in the current project is actually a long-running daemon that creates/deletes multiple HTML files which are part of a "health check" system for a Cisco SLB device.

      I should also have clarified in my original post, "one copy at a time" is on a per-host basis.

      -Nitrox