in reply to Using threads to run multiple external processes at the same time

Let me answer your last question first: I don't know of any (and I don't think there is a) stable, easy way to extend a threaded program across multiple workstations. (Remember: I said "easy". :-) ).

Here is what I'd do to spread this over multiple computers with as little effort as possible:

  • Create an NFS (or other shared) directory that all computers can access.
  • Split your job into smaller files and put them into the shared directory.
  • Start a worker script on each computer which repeatedly claims an unprocessed file from the shared directory, processes it, and removes it when done.
  • If you want to use threads, have a look at perlthrtut and maybe perlothrtut. These are good manpages and I couldn't explain it better here.
    When using a Queue (described there), you should be safe from races and other bad conditions, but be sure to "my" all variables in each sub and avoid slow thread-shared variables whenever possible. Also check that all modules used by the threaded part are thread-safe!
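    The worker script for the shared-directory method could be sketched like this. It's only a sketch under some assumptions of mine: jobs are *.job files in the shared directory, and each worker "claims" a file with an atomic rename() (rename is atomic within one filesystem, so two workers can't grab the same job):

    ```perl
    use strict;
    use warnings;

    # Claim and process every *.job file in $dir; returns the number
    # of jobs this worker handled. $process is a callback that does
    # the real work on one claimed file.
    sub drain_jobs {
        my ($dir, $process) = @_;
        my $done = 0;
        while (1) {
            opendir my $dh, $dir or die "Cannot open $dir: $!";
            my @jobs = grep { /\.job\z/ } readdir $dh;
            closedir $dh;
            last unless @jobs;                 # nothing left to do
            for my $job (@jobs) {
                my $mine = "$dir/$job.$$";     # claim it with our PID
                # rename() is atomic on one filesystem, so only one
                # worker wins the race for a given job file.
                next unless rename "$dir/$job", $mine;
                $process->($mine);             # run the external command here
                unlink $mine;
                $done++;
            }
        }
        return $done;
    }
    ```

    Start one copy of this on each machine and they will share the work without ever talking to each other; the directory itself is the queue.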
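    For the thread route, here is a minimal Thread::Queue sketch (requires a perl built with thread support; the worker count and the idea of queueing shell commands are my assumptions, and the undef end-markers are the usual convention from perlthrtut):

    ```perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    # Run @cmds through $nworkers threads; returns how many were run.
    sub run_queued {
        my ($nworkers, @cmds) = @_;
        my $q = Thread::Queue->new();
        $q->enqueue(@cmds);
        $q->enqueue((undef) x $nworkers);   # one end-marker per worker
        my @workers = map {
            threads->create(sub {
                my $done = 0;               # "my" everything in the thread
                while (defined(my $cmd = $q->dequeue())) {
                    system($cmd) == 0 or warn "'$cmd' failed: $?\n";
                    $done++;
                }
                return $done;
            });
        } 1 .. $nworkers;
        my $total = 0;
        $total += $_->join() for @workers;
        return $total;
    }
    ```

    The queue does all the locking for you; note that nothing except the queue itself is shared between the threads.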

    Personally, I'd prefer the first method. It's not as simple as a thread queue once you've learned how to use the queue, but it can be spread over a huge number of computers. A launcher which detects the number of CPUs (/proc/cpuinfo) and starts one process per CPU should be easy to write, and it could even be started as a CGI if you secure the URL (.htaccess?).
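    Such a launcher might look like this; a sketch assuming the Linux /proc/cpuinfo format (one "processor" line per CPU), with the actual worker left as a callback:

    ```perl
    use strict;
    use warnings;

    # Count CPUs by counting "processor" lines (Linux /proc/cpuinfo).
    sub cpu_count {
        my ($cpuinfo) = @_;
        $cpuinfo = '/proc/cpuinfo' unless defined $cpuinfo;
        open my $fh, '<', $cpuinfo or return 1;   # fall back to one CPU
        my $n = grep { /^processor\s*:/ } <$fh>;
        close $fh;
        return $n || 1;
    }

    # Fork one worker per CPU and wait for all of them to finish.
    sub launch_workers {
        my ($code, $n) = @_;
        $n = cpu_count() unless defined $n;
        my @pids;
        for my $i (0 .. $n - 1) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {          # child: do the work, then exit
                $code->($i);
                exit 0;
            }
            push @pids, $pid;
        }
        waitpid $_, 0 for @pids;      # parent: wait for all workers
        return scalar @pids;
    }
    ```

    Each forked child could simply run the worker loop from the shared-directory method above.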
