in reply to Converting a parallel-serial shell script

I don't see what the problem with forking is; maybe I'm just missing something. Here is what I would do:

Create a private semaphore
Fork to create N processes
In every process:
- convert its part
- acquire the semaphore
- do the db import
- release the semaphore
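A minimal sketch of those steps, in Python for illustration; `convert_part` and `db_import` are hypothetical stand-ins for the real conversion and import work:

```python
import multiprocessing

def convert_part(i):
    # placeholder for the real conversion of slice i; runs in parallel
    return f"converted-{i}"

def db_import(result, sem):
    # the semaphore serializes the import step across all workers
    with sem:
        print(f"importing {result}")

def worker(i, sem):
    result = convert_part(i)   # parallel part
    db_import(result, sem)     # one at a time

if __name__ == "__main__":
    n = 4
    sem = multiprocessing.Semaphore(1)   # the "private semaphore"
    procs = [multiprocessing.Process(target=worker, args=(i, sem))
             for i in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

The conversions overlap, but only one process is ever inside the `with sem:` block, so the imports happen one after another.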


Re^2: Converting a parallel-serial shell script
by Corion (Patriarch) on Sep 18, 2008 at 11:51 UTC

    "The semaphore" goes into file-locking territory, and on operating systems where file locks are only advisory, I want to avoid file locking, or at least prefer a prepackaged solution. Your approach of forking the whole script would have the advantage of keeping the script simple. It would have the disadvantage that I couldn't aggregate the results of the "child" scripts, but I don't need that for the application at hand.

      What if you simply spawn children to do the conversions and then have the parent import the results into the database as each one completes its task? There is only one parent, so the imports will be nice and sequential.
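One way to sketch that "children convert, parent imports" split, again in Python for illustration; `convert_part` and `db_import` are hypothetical names:

```python
import multiprocessing

def convert_part(i):
    # child process: the CPU-heavy conversion runs here, in parallel
    return f"part-{i}.tsv"

def db_import(path):
    # parent process: imports run here, so they are naturally sequential
    print(f"importing {path}")

if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        # imap_unordered hands each result back as soon as a child finishes
        for path in pool.imap_unordered(convert_part, range(8)):
            db_import(path)
```

No semaphore is needed: the single parent is the only process that ever touches the database.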

      You don't need file locks if you use files creatively: have the children put their results into a results###.tmp file. After they have finished, they can rename the file to results###.tsv.

      The parent simply waits for the .tsv files to appear, and the files are complete as soon as they appear.
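The write-then-rename handshake might look like this, assuming POSIX semantics (a rename within one filesystem is atomic); the helper names are illustrative:

```python
import os

def write_result(directory, n, data):
    # the child writes to a .tmp name that the parent never looks at
    tmp = os.path.join(directory, f"results{n}.tmp")
    final = os.path.join(directory, f"results{n}.tsv")
    with open(tmp, "w") as f:
        f.write(data)
    # atomic rename: the .tsv appears only once it is fully written
    os.rename(tmp, final)
    return final

def ready_results(directory):
    # the parent only ever scans for finished .tsv files
    return sorted(f for f in os.listdir(directory) if f.endswith(".tsv"))
```

Because the parent filters on the `.tsv` suffix, a half-written `.tmp` file can never be picked up by mistake.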

      Or, if the file rename is not atomic enough, you can instead create a flag file results###.tsv.done after the .tsv is finished. The .tsv will be complete, closed, and ready to import before the flag file appears.
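The flag-file variant is a small change to the same sketch: close the .tsv first, then create the empty marker. Names are again illustrative:

```python
import os

def write_with_flag(directory, n, data):
    tsv = os.path.join(directory, f"results{n}.tsv")
    with open(tsv, "w") as f:
        f.write(data)
    # only after the .tsv is closed do we create the .done marker
    open(tsv + ".done", "w").close()

def done_results(directory):
    # a .tsv is safe to import once its .done marker exists;
    # strip the ".done" suffix to recover the data file's name
    return sorted(f[:-5] for f in os.listdir(directory)
                  if f.endswith(".done"))
```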