Tanalis has asked for the wisdom of the Perl Monks concerning the following question:

Hey all,

I'm in the middle of designing/writing a script for work that involves varying a subset of a large set of numbers many times, then feeding the entire set of numbers through a pre-written analysis engine, which takes around 15 minutes to complete its analysis each time. The subset of numbers, and the amount they vary, changes with each cycle of the script.

We have no shortage of computers and/or processors in my workplace, and what I'd like to do is distribute the analysis over more than one computer (well, processor - some computers have multiple) in the department. I've designed a script to do just that, which is below, and involves forking off a child process for each call to the analysis engine, rsh-ing to a different computer and running the engine there.

Note that @data is an array of arrays, each containing a set of numbers to be analysed, and that @computers is a list of the computers available to me. This code is untested; I'm sure there are a couple of bugs somewhere - as I said, I'm still designing at the moment, implementing comes later, I hope.
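For concreteness, the two structures look roughly like this (the machine names and numbers here are made up; in practice there will be at least ten sets of numbers):

    # Hypothetical layout of the data described above
    my @computers = qw( node01 node02 node03 node04 node05
                        node06 node07 node08 node09 node10 );

    # each element of @data is a reference to one full set of numbers
    my @data = (
        [ 1.02, 3.17, 0.98, 5.44 ],
        [ 1.05, 3.17, 1.01, 5.44 ],
        [ 1.02, 3.20, 0.98, 5.50 ],
        # ...
    );

And here's the script itself: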

    my $runningprocesses = 0;

    foreach my $element (@data) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # in child process: pick a machine round-robin and run the engine there
            my $host = $computers[ $runningprocesses % @computers ];
            system("rsh $host 'analyse @$element'");
            exit;
        }

        ++$runningprocesses;    # parent: count the children we've started
    }

    while ($runningprocesses > 0) {
        wait;                   # wait for all the children to return
        --$runningprocesses;
    }

While I'm fairly sure this'd do what I need, it doesn't seem particularly elegant - and I'm wondering if anyone has any suggestions about better ways to either distribute work or to manage child processes. There will be at least ten calls to the analysis engine, and as runtime is limited (the script will be run regularly), distribution becomes something of a necessity.

Any advice, suggestions, insults, or whatever people want to throw at me would be greatly appreciated. This is my first foray into anything like this (distribution, that is; not forking, and certainly not Perl) and I'm sure there's a better way.

Thanks in advance
-- Foxcub

Re: Child Process Management and Distributed Systems
by rcaputo (Chaplain) on Oct 04, 2002 at 07:36 UTC

    Using remote shells is a common way to distribute tasks, and possibly the least troublesome one, provided that your network consists of identical machines. Configuring your remote tasks can be interesting (for "frustrating" values of interesting) if you plan to run them across a heterogeneous network.

    I learned this the hard way while writing a POE-based program to build testing environments and test software on several machines at once. The machines were selected to represent a wide spectrum of architectures and operating systems, which greatly complicated things.

    You could also set up job servers on the machines where you want to run your analyses. A master control program could transmit jobs to them and receive back responses when jobs complete. If you're running this out of cron, the job servers could even send asynchronous replies by some other means (perhaps e-mail). POE's cookbook has two recipes for job daemons. One covers non-web application clients and servers. The other implements an interactive job server.
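    To make the master/job-server idea concrete, here is a stripped-down sketch using plain sockets rather than POE (the port, host name, and the 'analyse' command line are placeholders, and there is no error handling or security); the cookbook recipes do the same thing far more robustly:

        use strict;
        use IO::Socket::INET;

        # Minimal job server: accept one connection at a time, read a
        # single-line job, run it, and send the output back.
        my $server = IO::Socket::INET->new(
            LocalPort => 9000,
            Listen    => 5,
            Reuse     => 1,
        ) or die "can't listen: $!";

        while (my $client = $server->accept()) {
            my $job = <$client>;          # e.g. "analyse 1.02 3.17 0.98 5.44"
            chomp $job;
            my $output = `$job`;          # run the job, capture its output
            print $client $output;
            close $client;
        }

    The master side then just connects to each machine, writes the job, and reads the result:

        use IO::Socket::INET;

        my $conn = IO::Socket::INET->new(
            PeerAddr => 'node01',         # placeholder host name
            PeerPort => 9000,
        ) or die "can't connect: $!";

        print $conn "analyse @numbers\n";            # @numbers is one set of numbers
        my $result = do { local $/; <$conn> };       # slurp everything the server sends
        close $conn;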

    The cookbook also includes a few different ways to manage child processes. It should not be hard to adapt one of them to run tasks remotely.
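    For the child-management part by itself, something like Parallel::ForkManager would also let you keep the fork/rsh approach from the original post while capping how many analyses run at once; a rough sketch, reusing @data and @computers from above:

        use strict;
        use Parallel::ForkManager;

        # allow at most one running job per machine
        my $pm = Parallel::ForkManager->new(scalar @computers);

        my $i = 0;
        foreach my $element (@data) {
            my $host = $computers[ $i++ % @computers ];
            $pm->start and next;                      # parent moves on to the next job
            system("rsh $host 'analyse @$element'");  # child runs the remote analysis
            $pm->finish;                              # child exits
        }
        $pm->wait_all_children;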

    -- Rocco Caputo