Tanalis has asked for the wisdom of the Perl Monks concerning the following question:
I'm in the middle of designing/writing a script for work that involves varying a subset of a large set of numbers many times, then feeding the entire set of numbers through a pre-written analysis engine, which takes around 15 minutes to complete its analysis each time. The subset of numbers, and the amount they vary, changes with each cycle of the script.
We have no shortage of computers and/or processors in my workplace, and what I'd like to do is distribute the analysis over more than one computer (well, processor - some computers have multiple) in the department. I've designed a script to do just that, which is below, and involves forking off a child process for each call to the analysis engine, rsh-ing to a different computer and running the engine there.
Note that @data is an array of arrays, each containing the set of numbers to be analysed, and that @computers is a list of computers available to me. This code is untested; I'm sure there's a couple of bugs somewhere - as I said, I'm designing atm, implementing comes later, I hope.
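    my $runningprocesses = 0;
    foreach my $element (@data) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # in child process: pick a computer round-robin and run the engine there
            my $host = $computers[$runningprocesses % @computers];
            system("rsh $host 'analyse @$element'");
            exit;
        }
        ++$runningprocesses;    # in parent: one more child to wait for
    }
    while ($runningprocesses > 0) {
        wait;    # wait for all the children to return
        --$runningprocesses;
    }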
While I'm fairly sure this'd do what I need, it doesn't seem particularly elegant - and I'm wondering if anyone has any suggestions about better ways to either distribute work or to manage child processes. There will be at least ten calls to the analysis engine, and as runtime is limited (the script will be run regularly) distribution becomes something of a necessity.
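One possible direction for the child-management side, as a rough sketch rather than a tested solution: instead of forking every job at once and assigning computers round-robin, keep a list of free computers and only start a job when one is free, handing the computer back when its child exits. The names run_on_hosts, $run, %host_of, @free and @pending are hypothetical and not part of the script above; the $run callback is an assumption that lets the sketch be tried without rsh.

```perl
use strict;
use warnings;

# Hypothetical helper: run at most one job per host at a time.
# $hosts is an array ref of host names, $data an array ref of data sets,
# and $run a code ref called in the child as $run->($host, $element);
# it should return true on success.
sub run_on_hosts {
    my ($hosts, $data, $run) = @_;
    die "no hosts given" unless @$hosts;

    my @free    = @$hosts;    # hosts with nothing running on them
    my @pending = @$data;     # data sets not yet started
    my %host_of;              # child pid => host that child is using
    my $failed  = 0;          # number of children that exited non-zero

    while (@pending or %host_of) {
        # Start jobs while there is both work left and a free host.
        while (@pending and @free) {
            my $host    = shift @free;
            my $element = shift @pending;
            my $pid     = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {
                # Child: do the work and report the result via exit status.
                exit( $run->($host, $element) ? 0 : 1 );
            }
            $host_of{$pid} = $host;    # parent: remember which host is busy
        }

        # Block until any child finishes, then free its host.
        my $pid = wait();
        last if $pid == -1;                    # no children left
        next unless exists $host_of{$pid};     # not one of ours
        ++$failed if $? != 0;
        push @free, delete $host_of{$pid};
    }
    return $failed;
}
```

With the script above, this could be called as run_on_hosts(\@computers, \@data, sub { my ($host, $element) = @_; system('rsh', $host, 'analyse', @$element) == 0 }). A multi-processor computer can simply appear in @computers more than once. The return value is the number of jobs whose child exited with a non-zero status, which gives the caller a chance to notice failed runs.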
Any advice, suggestions, insults, or whatever people want to throw at me would be greatly appreciated. This is my first foray into anything like this (distribution, not forking and certainly not Perl) and I'm sure there's a better way.
Thanks in advance
-- Foxcub
Re: Child Process Management and Distributed Systems
by princepawn (Parson) on Oct 03, 2002 at 20:51 UTC

Re: Child Process Management and Distributed Systems
by perrin (Chancellor) on Oct 03, 2002 at 22:54 UTC

Re: Child Process Management and Distributed Systems
by rcaputo (Chaplain) on Oct 04, 2002 at 07:36 UTC