One approach is to use OpenPBS, a portable batch system. It provides queueing, parallel processing, prioritization, and more. It is open source and runs on Unix systems. There is a Perl binding, PBS::Client, that provides a nice interface:
use PBS::Client;

# Create a client object linked to a server
my $client = PBS::Client->new;

# Describe the job
my $job = PBS::Client::Job->new(
    %job_options,    # e.g. queue => 'queue_1', mem => '800mb'
    cmd => \@commands,
);

# Optionally, re-organize the commands into a number of queues
$job->pack(numQ => $numQ);

# Submit the job
$client->qsub($job);
Alternatively, a pure Perl solution could be built around Schedule::Depend. Hmm, it went up to v. 2.6, but doesn't seem to be on CPAN any more...
Update: Fixed the OpenPBS link.
I extended an existing system based on XML-RPC to do something similar (see the Frontier::Daemon and Frontier::Client modules for more info on setting up an XML-RPC system). Using that approach, you could do something like this:
- Create a centralized queue server that accepts and queues jobs to be run. Job-completion tracking could be built into this same server. You'll also need it to track which machines are available and which are busy, honoring a maximum number of running jobs that you define.
- Dispatch each queued job to an available machine with an XML-RPC call.
- When a job is completed, an XML-RPC message is sent back to the job-tracking server (a minimal sketch follows).
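Here's a rough sketch of that architecture. The Frontier::Daemon and Frontier::Client calls are the real API, but the method names (submit_job, next_job, job_done), port, and host are hypothetical placeholders -- adapt them to your own protocol:

#!/usr/bin/perl
# queue_server.pl -- toy in-memory XML-RPC job-queue server
use strict;
use warnings;
use Frontier::Daemon;

my @queue;    # jobs are plain hashrefs; lost if the server restarts

Frontier::Daemon->new(
    LocalPort => 8080,
    methods   => {
        # a client submits a job description (any data structure)
        submit_job => sub { push @queue, $_[0]; return scalar @queue },
        # a worker asks for the next job ({} means "nothing to do")
        next_job   => sub { return shift(@queue) || {} },
        # a worker reports completion of a job
        job_done   => sub { my ($id, $status) = @_;
                            print "job $id finished: $status\n";
                            return 1 },
    },
) or die "couldn't start XML-RPC daemon: $!";

A submitting client is just as short:

use Frontier::Client;

my $server = Frontier::Client->new(url => 'http://queuehost:8080/RPC2');
# Whole data structures go over the wire -- no command-line packing
# or temp files needed.
$server->call('submit_job', {
    cmd      => 'process_data.pl',
    input    => '/data/chunk1',
    priority => 2,
});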
There are probably more efficient ways to approach this, but XML-RPC worked well in our case because it fit into our existing system (we already had one set up) and was extremely easy to implement during refactoring. In particular, data structures can be passed as parameters without having to serialize them onto the command line or write them to a data file. Those parameters can include data, input/output filenames, the number of machines to run on, the priority of the job, and so on.
If you need more complex communication (IPC, for example), XML-RPC is probably not the best choice.
I'm sure more experienced and savvy monks will know of more efficient ways to approach this problem. I look forward to reading their responses.
Distributed::Process might help. To use it, you define a routine that does all of the work on the client side; the module takes care of the network communication and of starting the routine on all of your clients.
You could also write your own client programs using Net::Server. Then it's up to you to write a controlling program that contacts each of your client processes and starts the jobs running.
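A minimal sketch of such a worker, assuming a made-up line-oriented protocol and an arbitrary port; the real protocol design is up to you:

#!/usr/bin/perl
# worker.pl -- a toy job-running client built on Net::Server
package JobWorker;
use strict;
use warnings;
use base 'Net::Server';

# Net::Server calls process_request once per connection, with STDIN
# and STDOUT tied to the client socket.
sub process_request {
    my $self = shift;
    while (my $line = <STDIN>) {
        $line =~ s/[\r\n]+\z//;
        last if $line eq 'quit';
        my $output = `$line 2>&1`;   # naive: run each line as a shell command
        print "EXIT $?\n$output";
    }
}

JobWorker->run(port => 9000);

The controlling program then just opens a socket to each worker's port and writes commands down it.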
If you do not want (or need) the client processes to listen on the network all of the time, another way is to write a controlling program that logs into each client machine with SSH or RSH and runs a client script.
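That controller can be as simple as a loop over system() calls. The host list and script path below are placeholders -- substitute your own:

#!/usr/bin/perl
# dispatch.pl -- start a job script on each client machine over ssh
use strict;
use warnings;

my @hosts  = qw(node1 node2 node3);
my $script = '/usr/local/bin/run_one_job';

for my $host (@hosts) {
    # -f sends ssh to the background after authentication, so the
    # loop doesn't block while long-running jobs execute.
    system('ssh', '-f', $host, $script) == 0
        or warn "could not start job on $host (exit $?)\n";
}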
How is the work to be done scheduled and otherwise managed? Do you need to be able to kill tasks once they start executing? Do you need to be able to remove tasks from the queue? Is it sufficient to have a single "reliable" system as the manager, or should the management be distributed in some fashion? Do you require task result reporting? Do the tasks have to contend for central resources?
An "easy" way to do something like this is to use an email server to manage the queue. The task servers then check the queue from time to time and fetch tasks by reading and deleting an email from the queue. Because the email can contain pretty much anything it is easy to tailor tasks and distribute any information that is required for execution of the task.
DWIM is Perl's answer to Gödel