I want to split the burden of processing a huge data file across a number of instances. Basically: split the starting file into parts, give each instance a part to work on, then collect and (re-)assemble the result data as it becomes available. The parts can be processed independently of one another; the work is basically a validation.
I have something similar working fine with forks on a single machine, but I'd like to see how it performs when the job is split across a number of different machines.
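To make the current setup concrete, here is a minimal sketch of the scatter/gather pattern I have in mind, written in Python for illustration (the chunking strategy and the `validate_chunk` logic are placeholders, not my actual validation):

```python
from multiprocessing import Pool

def validate_chunk(lines):
    # Placeholder validation: keep only lines that are plain integers.
    return [line for line in lines if line.strip().isdigit()]

def split_into_chunks(lines, n):
    # Split the input into n roughly equal parts, preserving order.
    size = (len(lines) + n - 1) // n
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def process(lines, workers=4):
    chunks = split_into_chunks(lines, workers)
    with Pool(workers) as pool:
        results = pool.map(validate_chunk, chunks)  # scatter to worker processes
    # Gather: pool.map returns results in chunk order, so order is preserved.
    return [line for part in results for line in part]

if __name__ == "__main__":
    data = ["1", "x", "23", "", "42"]
    print(process(data))
```

This works because `Pool.map` handles both distribution and ordered collection for me; the open question is what replaces that convenience when the workers are remote machines rather than local processes.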
The "here's your part / give me back what you've done" exchange is where I'm looking for solutions, as I'm unsure how to communicate with the (remote) processes doing their work on the launched instances.
Thanks.