atcroft has asked for the wisdom of the Perl Monks concerning the following question:

I am looking for a way to limit the number of concurrent executions of a script to some value x, where [1 <= x <= n ]. Limiting to 1 instance appears to be the easy case, but it seems more difficult to limit it to just some n greater than 1.

Because of the manner in which the process may be launched (it is a filter launched by another process, and no, it is not email-related), it is conceivable that several thousand instances could be launched in the span of a few minutes. I wish to limit this to a much smaller (and processable) number, such as fewer than 100. Because of the time involved (and other potential issues), I am trying to avoid solutions that create a large number of lock files (due to speed issues) or that are susceptible to race conditions and the like.

One suggestion I encountered was to use IPC::Semaphore: each process updates a shared semaphore and exits if the value it contains exceeds my limit. (I believe I understood that correctly; please correct me if I am wrong.) I had difficulties implementing this, although it seems to be working for the moment. Does it look as if I have this correct? Can anyone suggest a better way, or point out any issues with my implementation?

#!/usr/bin/perl -w

use strict;

use IPC::SysV qw(IPC_PRIVATE S_IRWXU IPC_CREAT);
use IPC::Semaphore;

srand;

my $sem = new IPC::Semaphore( 1234, 1, S_IRWXU | IPC_CREAT );

my $semval = $sem->getval(0);
$semval = 0 unless ( defined($semval) );

exit(0) if ( $semval >= 10 );

$sem->setval( 0, ( $semval + 1 ) );

#
# Do work... or in this case,
# print the value of $semval and sleep for a while
#
print $semval + 1, "\n";
sleep( int( rand(20) + 1 ) );

#
# Decrement the semaphore value
#
$semval = $sem->getval(0);
if ($semval) {
    $sem->setval( 0, ( $semval - 1 ) );
} else {
    $sem->remove;
}

__END__
#
# Sample test execution (in bash):
#
# index=30
# while [ $index -gt 0 ]
# do
#   perl test.pl &
#   sleep 1
#   let "index=index-1"
# done
#

Replies are listed 'Best First'.
Re: Effective use of IPC::Semaphore to limit number of concurrent executions of a script
by ptkdb (Monk) on Oct 16, 2003 at 01:22 UTC
    Reference: Unix Network Programming, Volume 2: Interprocess Communications, by W. Richard Stevens.

    A semaphore is basically an atomic counter. The kernel guarantees that only one counter operation (increment or decrement) occurs at a time, no matter how many processes or threads are trying to operate on it simultaneously. The counter also cannot be decremented below 0. Any attempt to decrement the counter when it is zero will block until another process increments the semaphore, unless the IPC_NOWAIT flag is set on the operation, in which case it fails with errno set to EAGAIN or EWOULDBLOCK. Once the semaphore is incremented, if more than one process is waiting to decrement it back to zero, only one will succeed.
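As a minimal sketch of the counter behaviour just described (not code from the original reply): the helper name try_slots is hypothetical, and the example uses an IPC_PRIVATE semaphore set so it cannot collide with any real key. It initialises the counter to 2 and makes three non-blocking decrements.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Errno qw(EAGAIN);
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_NOWAIT S_IRWXU);
use IPC::Semaphore;

# try_slots is a hypothetical helper, not part of IPC::Semaphore.
# It creates a private one-semaphore set holding $slots, makes $attempts
# non-blocking decrements, and reports "ok" or "busy" for each attempt.
sub try_slots {
    my ($slots, $attempts) = @_;

    my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRWXU | IPC_CREAT)
        or die "semget failed: $!";
    $sem->setval(0, $slots) or die "setval failed: $!";

    my @results;
    for (1 .. $attempts) {
        if ($sem->op(0, -1, IPC_NOWAIT)) {
            push @results, 'ok';        # counter was > 0 and is now one lower
        }
        elsif ($! == EAGAIN) {
            push @results, 'busy';      # counter was 0; nothing changed
        }
        else {
            push @results, "error: $!";
        }
    }

    $sem->remove;                       # delete the set so it does not linger
    return @results;
}

print join(' ', try_slots(2, 3)), "\n";   # prints "ok ok busy"
```

Without IPC_NOWAIT the third decrement would block instead of returning, which is the behaviour the paragraph above describes.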

    Typically semaphores are used sort of like mutexes to prevent two processes from touching the same piece of shared memory at the same time.

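    However, your use for them is fine as well. The only thing is that you don't want to be incrementing it using getval and setval, because reading the value and writing it back are two separate steps and another process can slip in between them. You want to use the 'op' method instead. Your code should look something like this: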

    use IPC::SysV qw/SEM_UNDO IPC_CREAT S_IRWXU ftok/ ;
    use IPC::Semaphore ;

    # start script
    my $flags = S_IRWXU ;   # permissions; owner-only is an assumption, adjust as needed
    my $id = ftok($0, 0) ;
    my $sem = new IPC::Semaphore($id, 1, $flags) ;
    unless($sem) {
        # we must be the first one
        $sem = new IPC::Semaphore($id, 1, $flags | IPC_CREAT) ;
        $sem->setval(0, 100) ;
    }

    ##
    ## Decrement the #0 semaphore by 1
    ##
    $sem->op(0, -1, SEM_UNDO) ; # blocks if semaphore is zero

    #
    # DO SCRIPT STUFF
    #

    ##
    ## Increment #0 semaphore by 1
    ##
    $sem->op(0, 1, SEM_UNDO) ;
    # once past this point, any script waiting can proceed
    # done
    NOTES:
  • ftok generates a key based on the path (and id) that you give it. If the path doesn't exist you'll get undef.
  • SEM_UNDO performs a nice service for you: if your script 'croaks' before it increments the semaphore, the kernel will 'undo' the collective operations that the process performed on that semaphore. Ref Unix Network Programming Vol 2 pgs 174, 286-287
  • Although you probably have already, get familiar with the commands 'ipcs' and 'ipcrm'.
  • Semaphores can also be allocated in blocks of more than one to provide more sophisticated uses. You would use 'nsems' to specify the number of semaphores you want.
  • The op method can apply several different operations to the block of semaphores in one call. However, in order for 'op' to succeed (i.e. not block), all of them have to be able to occur; they are applied atomically, all or nothing.
  • At the request of the IAEA, I've been asked to explain that atomic in this context DOES NOT refer to any operation of fission, but rather refers to the quality of 1 at a time.
  • Would it be blasphemy for me to request a separate shrine for Mr Stevens (author of the referenced material)?
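To illustrate the notes about blocks of semaphores and the all-or-nothing behaviour of 'op', here is a small sketch. The helper name atomic_pair is hypothetical; it uses a private set of two semaphores and tries to decrement both in a single non-blocking call.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_NOWAIT S_IRWXU);
use IPC::Semaphore;

# atomic_pair is a hypothetical helper. It creates a set of two semaphores
# (nsems = 2), initialises them to ($first, $second), attempts to decrement
# both in one op() call, and returns the values afterwards.
sub atomic_pair {
    my ($first, $second) = @_;

    my $sem = IPC::Semaphore->new(IPC_PRIVATE, 2, S_IRWXU | IPC_CREAT)
        or die "semget failed: $!";
    $sem->setall($first, $second) or die "setall failed: $!";

    # One call, two operations: (semaphore 0, -1) and (semaphore 1, -1).
    # The return value is ignored on purpose; the values below tell the story.
    $sem->op(0, -1, IPC_NOWAIT,
             1, -1, IPC_NOWAIT);

    my @after = $sem->getall;
    $sem->remove;
    return @after;
}

print join(',', atomic_pair(1, 0)), "\n";   # prints "1,0": neither was decremented
print join(',', atomic_pair(1, 1)), "\n";   # prints "0,0": both were decremented
```

In the first call semaphore 1 is already zero, so the whole request fails and semaphore 0 is left untouched as well.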

      If I don't want the processes to block, then would the only changes I needed to make be in

      • line 5, to include the symbols SEM_UNDO and ftok in the use IPC::SysV entry,
      • line 10, to use ftok($0, 0) instead of my original '1234' id,
      • line 17, to use op(0, 1, SEM_UNDO) in place of my setval(0, ($semval+1)), and
      • line 31, to use op(0, -1, SEM_UNDO) in place of my setval(0, ($semval-1))
      resulting in the code below (rev. of previous code)? Are there any other obvious problems (that might lead to race conditions, for example) in the code below?

      I actually don't want the extraneous ones to block, but to go ahead and exit, because other instances can pick up and process waiting data later. I just don't want to slam the machine with a huge number of processes doing heavy processing at once.

      #!/usr/bin/perl -w

      use strict;

      use IPC::SysV qw(IPC_PRIVATE S_IRWXU IPC_CREAT SEM_UNDO ftok);
      use IPC::Semaphore;

      srand;

      my $sem = new IPC::Semaphore( ftok( $0, 0 ), 1, S_IRWXU | IPC_CREAT );

      my $semval = $sem->getval(0);
      $semval = 0 unless ( defined($semval) );

      exit(0) if ( $semval >= 10 );

      $sem->op( 0, 1, SEM_UNDO );

      #
      # Do work... or in this case,
      # print the value of $semval and sleep for a while
      #
      print $semval + 1, "\n";
      sleep( int( rand(20) + 1 ) );

      #
      # Decrement the semaphore value
      #
      $semval = $sem->getval(0);
      if ($semval) {
          $sem->op( 0, -1, SEM_UNDO );
      } else {
          $sem->remove;
      }

      __END__
      #
      # Sample test execution (in bash):
      #
      # index=500
      # while [ $index -gt 0 ]
      # do
      #   perl test.pl &
      #   let "index=index-1"
      # done
      #
        You don't need to call 'getval(0)' to check the value of the semaphore. Instead, set the flags to SEM_UNDO | IPC_NOWAIT and check the return value and $!. If $! is EAGAIN, the semaphore value was 0, so no slots were available and $sem->op() returned without altering the semaphore's value.
        use Errno qw/EAGAIN EINTR/ ;
        use IPC::SysV qw/SEM_UNDO IPC_NOWAIT/ ;

        $result = $sem->op(0, -1, SEM_UNDO | IPC_NOWAIT) ;
        if( !$result && $! == EAGAIN ) {
            # semaphore was zero, no slots available
            print "BUSY, try later\n" ;
            exit(0) ;
        }
        die "op failed: $!" unless $result ; # op failed for some other reason

        ##
        ## lock acquired
        ##
        Blocking isn't really as bad as it sounds. The process will just wait until one of the other instances finishes and increments the semaphore to something greater than 0. How long that takes depends on how many others are competing to decrement the semaphore at the same time.

        If you want, you can add a time out using 'alarm'.

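        local $SIG{ALRM} = sub { } ;  # without a handler, SIGALRM would kill the process
        alarm(5) ;                    # five seconds
        $success = $sem->op(0, -1, SEM_UNDO) ;  # blocking decrement
        $result = int $! ;            # capture the value of errno
        alarm(0) ;                    # cancel alarm (if still active)
        if( !$success && $result == EINTR ) {
            # timed out
        }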
      • line 17: op(0, -1, SEM_UNDO) in place of setval(0, ($semval+1))
      • line 31: op(0,  1, SEM_UNDO) in place of setval(0, ($semval-1))

        Semaphores DECREMENT to lock, and INCREMENT to unlock.
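Putting the pieces of this sub-thread together, a non-blocking limiter might be sketched as below. This is an illustration under stated assumptions, not anyone's posted code: the sub names open_semaphore, acquire_slot and release_slot are hypothetical, and IPC_EXCL is added here so that only the process that actually creates the semaphore initialises it. A process that opens the set in the brief window before the creator calls setval sees a count of 0 and simply treats it as "busy", which is harmless when extra instances are meant to exit anyway.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Errno qw(EAGAIN EEXIST);
use IPC::SysV qw(IPC_CREAT IPC_EXCL IPC_NOWAIT SEM_UNDO S_IRWXU ftok);
use IPC::Semaphore;

# Open the semaphore for $key, creating it and setting it to $limit
# if this process is the first one to ask for it.
sub open_semaphore {
    my ($key, $limit) = @_;

    my $sem = IPC::Semaphore->new($key, 1, S_IRWXU | IPC_CREAT | IPC_EXCL);
    if ($sem) {
        $sem->setval(0, $limit) or die "setval failed: $!";
        return $sem;
    }
    die "semget failed: $!" unless $! == EEXIST;

    # Somebody else created it already; just attach to it.
    $sem = IPC::Semaphore->new($key, 1, S_IRWXU)
        or die "semget failed: $!";
    return $sem;
}

# DECREMENT to lock. Returns 1 if a slot was taken, 0 if none was free.
sub acquire_slot {
    my ($sem) = @_;
    return 1 if $sem->op(0, -1, SEM_UNDO | IPC_NOWAIT);
    return 0 if $! == EAGAIN;
    die "semop failed: $!";
}

# INCREMENT to unlock.
sub release_slot {
    my ($sem) = @_;
    $sem->op(0, 1, SEM_UNDO) or die "semop failed: $!";
    return 1;
}

# Typical use near the top of the script (limit of 10 as in the thread):
#
#   my $key = ftok($0, 1);
#   defined $key or die "ftok failed for $0";
#   my $sem = open_semaphore($key, 10);
#   exit 0 unless acquire_slot($sem);
#   ... do the work ...
#   release_slot($sem);
```

Note that nothing here ever removes the semaphore; it stays around between runs and can be inspected with 'ipcs' or deleted with 'ipcrm'.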
Re: Effective use of IPC::Semaphore to limit number of concurrent executions of a script
by merlyn (Sage) on Oct 16, 2003 at 01:29 UTC
    The problem with using a semaphore is that the kernel isn't smart enough to decrement the counter on your behalf if a process happens to abort early.

    A better solution might be to implement an extended "highlander"-style solution as presented in my column that flocks one of N files instead of just one file. Then, if the process aborts for any reason, the flock is dropped, and a new process can claim its rightful slotishness.

    I've had that extension to the highlander solution on my column to-do list for a number of months; perhaps it is time to finally write it up.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.
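For readers who want to try the "one of N files" idea suggested above, here is a rough sketch. It is not the code from the column mentioned in that reply; the sub name claim_slot and the slot.N file naming are hypothetical. It assumes a system where flock locks belong to the open file handle, so a lock vanishes automatically when the process exits for any reason.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);

# claim_slot is a hypothetical helper. It tries to take an exclusive,
# non-blocking flock on one of $n files named "$dir/slot.1" .. "$dir/slot.$n".
# Returns the locked filehandle (keep it open for as long as the slot is
# needed) or nothing if every slot is already held.
sub claim_slot {
    my ($dir, $n) = @_;

    for my $i (1 .. $n) {
        my $path = "$dir/slot.$i";
        sysopen(my $fh, $path, O_RDWR | O_CREAT, 0600)
            or die "cannot open $path: $!";
        return $fh if flock($fh, LOCK_EX | LOCK_NB);
        close $fh;                  # slot busy, try the next one
    }
    return;
}

# Typical use:
#
#   my $slot = claim_slot('/path/to/lockdir', 10) or exit 0;
#   ... do the work; the lock is released when $slot is closed
#   or when the process exits ...
```

Only N small files are ever created, so this differs from the "large number of lock files" approach the original question wanted to avoid.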

      I beg to differ on what happens when the process aborts. If you have the SEM_UNDO flag set on your operations, all of the collective operations on that semaphore are 'undone'. I refer to the semop man page, and will see what my other reference material says about it in the morning.
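The disagreement above is easy to check experimentally. The following sketch (the sub name value_after_child_dies is hypothetical) forks a child that takes the only slot of a private semaphore and then exits abruptly without giving it back, once with SEM_UNDO and once without, and reports the semaphore value the parent sees afterwards.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT SEM_UNDO S_IRWXU);
use IPC::Semaphore;
use POSIX qw(_exit);

# value_after_child_dies is a hypothetical helper. A child decrements a
# semaphore that starts at 1 and then exits without incrementing it.
# The return value is the semaphore's value once the child has been reaped.
sub value_after_child_dies {
    my ($use_undo) = @_;

    my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRWXU | IPC_CREAT)
        or die "semget failed: $!";
    $sem->setval(0, 1) or die "setval failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        $sem->op(0, -1, $use_undo ? SEM_UNDO : 0);   # take the slot
        _exit(1);                                    # "abort" without releasing
    }
    waitpid($pid, 0);

    my $value = $sem->getval(0);
    $sem->remove;
    return $value;
}

print "with SEM_UNDO:    ", value_after_child_dies(1), "\n";   # prints 1
print "without SEM_UNDO: ", value_after_child_dies(0), "\n";   # prints 0
```

With SEM_UNDO the kernel restores the slot when the child terminates; without it the slot stays taken, which is the failure mode the earlier reply was worried about.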