in reply to Using perl to speed up a series of bash commands by transforming them into a single command that will run everything in parallel.

I wrote a script to demonstrate the difference between running a series of five simple two-second commands "serially" (one after the other) versus running them in parallel. Serially, it takes ten seconds. Parallelly, it takes two...
touch a; ls -l a; sleep 2

I think your choice of commands for demonstration is a bit too simple -- to the extent that the results may be misleading.

If you parallelize any heavy processing on a single machine, you will of course see a slowdown in the execution time of each individual process, relative to how long it would take if it weren't competing with other heavy processes.

Given the nature of multi-processing, there will be a trade-off point somewhere: some number N such that running N processes in parallel is faster than running them serially, but running N+1 in parallel is slower than, say, running (N+1)/2 of them in parallel and then running the remainder in parallel as a second batch.

Mileage will vary depending on how heavy the processing is and which resources it needs most: memory-bound, CPU-bound and I/O-bound jobs might show slightly different trade-offs, depending on how you combine them and what your hardware happens to be.
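
The batching idea above can be sketched in a few lines of shell. This is only an illustration, not anything from the original script: the function name `run_in_batches` and its argument layout (batch size first, then one command string per job) are made up for this example, and the right batch size is something you would have to find by measuring on your own hardware.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post):
#   run_in_batches SIZE "cmd1" "cmd2" ...
# Starts at most SIZE commands at a time and waits for each batch
# to finish before starting the next one.
run_in_batches() {
    batch_size=$1
    shift
    running=0
    for cmd in "$@"; do
        sh -c "$cmd" &                 # start this job in the background
        running=$((running + 1))
        if [ "$running" -ge "$batch_size" ]; then
            wait                       # block until the whole batch is done
            running=0
        fi
    done
    wait                               # collect the last, possibly smaller, batch
}
```

For instance, `run_in_batches 3 "sleep 2" "sleep 2" "sleep 2" "sleep 2" "sleep 2"` starts three jobs, waits, then starts the remaining two, so it should take roughly four seconds. Note that each batch lasts as long as its slowest job, so this simple scheme is a rough cut rather than a proper job scheduler.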

Replies are listed 'Best First'.
Re^2: Using perl to speed up a series of bash commands by transforming them into a single command that will run everything in parallel.
by Anonymous Monk on Jun 12, 2006 at 13:11 UTC
    This is completely true. The only reason this looks faster is that the Perl scripts spend their 2 seconds sleeping. If each of them had 2000 milliseconds' worth of processing to do, running them in parallel would still require 10 seconds of processor time. Parallelism doesn't magically give you more processors. Parallel processing is useful when you can compute many partial results simultaneously and then combine them as inputs to another algorithm.
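
One way to check the difference between sleeping and real work is to time both kinds of job in parallel. The sketch below is an assumption-laden illustration: `burn_cpu` and `elapsed_parallel` are hypothetical helpers invented here, the loop count of 200000 is arbitrary, and timing uses `date +%s`, so results are only accurate to the nearest second.

```shell
#!/bin/sh
# Hypothetical helper: a CPU-bound job that counts to $1 without ever sleeping.
burn_cpu() {
    i=0
    while [ "$i" -lt "$1" ]; do
        i=$((i + 1))
    done
}

# Hypothetical helper: elapsed_parallel COPIES CMD [ARGS...]
# Starts COPIES instances of CMD in the background, waits for all of them,
# and prints the elapsed wall-clock time in whole seconds.
elapsed_parallel() {
    copies=$1
    shift
    start=$(date +%s)
    n=0
    while [ "$n" -lt "$copies" ]; do
        "$@" &
        n=$((n + 1))
    done
    wait
    echo $(( $(date +%s) - start ))
}

echo "5 x sleep 2 in parallel:  $(elapsed_parallel 5 sleep 2)s"
echo "5 x burn_cpu in parallel: $(elapsed_parallel 5 burn_cpu 200000)s"
```

The sleeping jobs should finish in about two seconds however many you start, because they need almost no processor time. The `burn_cpu` figure depends on how many cores are free: with a single processor, five parallel copies should take about as long as running the five one after another.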