Win has asked for the wisdom of the Perl Monks concerning the following question:

I want to do a bit of distributed computing. This is new to me and so I am not familiar with the language that is used to describe it. Anyway, my idea is as follows:

Up to 4 computers run this Perl program (all 4 when they are all switched on).
I want the following steps to be performed:


A. Background monitoring for incoming user request files.

B. Detection of an incoming request file to a network drive.

C. Determination of which of the 4 computers is available to handle the allocation of the request. If the more senior computers of the 4 are running, request handling is left to the most senior; if the senior PCs do not respond, the junior PCs respond in their turn. Subordinate PCs return to stage A.

D. The contents of 4 folders (on a network drive) are read (each folder corresponds to one of the 4 PCs).

E. For each of these folders, the sum of the file sizes of all files ending in '.txt' is measured.

F. ‘Request txt file’ is copied into the folder with the least sum of file sizes.

G. If all the folders are at the same status, I want to briefly assess CPU activity on all the PCs that are on (if this number is >2) and allocate the file to the folder of the PC with the least CPU activity. I only want this measurement made once per request file.

H. If the file is not removed from this folder after 5 seconds, I want to copy it to the folder with the second-smallest sum of .txt file sizes, until all the folders have been tried. This cyclic moving of files then continues, but at 10-second intervals, then 15-second intervals, and so on. Any PC found not to be on at stage C is omitted from the cycle. Once the wait reaches 1 minute, I want the allocating PC to handle the request itself.


Other programs are already in place that will process the request file and run with it. I have code that will handle stages A, B, D, F. Can anyone please suggest code, modules, pitfalls or previous examples to follow for these other stages? Also, please suggest improvements to the way that I describe the process that I want.
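For reference, the escalating schedule I have in mind for stage H can be sketched as a small helper (the sub name is just a placeholder):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stage-H retry schedule: cycle 1 waits 5s, cycle 2 waits 10s, and so
# on. Once the wait would reach 1 minute, return undef to signal that
# the allocating PC should handle the request itself.
sub retry_interval {
    my ($n) = @_;               # cycle number, starting at 1
    my $wait = 5 * $n;          # 5s, 10s, 15s, ...
    return $wait < 60 ? $wait : undef;
}
```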

Replies are listed 'Best First'.
Re: distributed computing
by philcrow (Priest) on Aug 30, 2005 at 18:25 UTC
    Many people do this type of work with POE. The idea is that one POE process receives requests or polls a resource on a timer (like the file system). It can then do what it likes; in your case, handing off the work to subservient servers. It's good at networked interactions like the ones you described.

    See the perl.com articles: http://perl.com/pub/a/2004/07/02/poeintro.html and its followup http://perl.com/pub/a/2004/07/22/poe.html.

    To find file sizes, use stat. That takes time, so you might want to cache results in some way.
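    A minimal sketch of the stat approach for stage E, summing the sizes of the .txt files in one folder (the folder path is up to you):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum the sizes of all *.txt files in a folder, using stat's size
# field (element 7). To avoid repeated stat calls over the network,
# you could cache results keyed on file name and mtime.
sub txt_bytes {
    my ($dir) = @_;
    my $total = 0;
    for my $file (glob "$dir/*.txt") {
        my $size = (stat $file)[7];
        $total += $size if defined $size;
    }
    return $total;
}
```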

    Phil

Re: distributed computing
by eric256 (Parson) on Aug 30, 2005 at 20:08 UTC

    Many people handle problems differently based on their background. For interprocess communication, I like databases; after all, with a good database someone has already written the server and the row-locking code. I'm assuming you have some staging area that files go to first, then they are copied to specific folders, and then some process on the target machine works with each file.

    So I would store each file's name and size in a database. Along with that information you could store the machine ID of the machine in charge of that file.

    Now you move the distribution burden to each process itself. This is nice because you no longer need a way for them to communicate directly. Here is a model of how this would work in your situation.

    FileX arrives. It is entered into the db as ("FileX", 1123, 0), where 1123 is the size and 0 is the 'owner' (the process responsible). The receiver's job is now complete.

    Each of your servers then constantly runs a little script that does the following: query the sum of the file sizes every other machine is processing. If its own sum is the lowest, it attempts to change the ProcessID on the file's database entry. If that succeeds, it is now the owner; if it fails, some other machine grabbed the file already. UPDATE files SET ProcessID = ? WHERE id = ? AND ProcessID = 0; This is a nice operation because it only assigns the file to your process if the ProcessID is still 0.
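    That conditional UPDATE is really a compare-and-set: the claim succeeds only if the file is still unowned. A pure-Perl model of its semantics (no database required; the table is simulated with a hash, and all names are hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Model of the conditional UPDATE: a claim succeeds only while the
# file is still unowned (ProcessID == 0). In a real database the
# row lock makes this atomic across machines.
my %owner;    # file name => owning process ID (0 = unclaimed)

sub add_file { my ($file) = @_; $owner{$file} = 0; }

sub try_claim {
    my ($file, $pid) = @_;
    return 0 unless exists $owner{$file} && $owner{$file} == 0;
    $owner{$file} = $pid;    # UPDATE ... WHERE ProcessID = 0
    return 1;                # one row affected => we own it
}
```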

    Now the process can move the file into its folder and proceed on its merry way. You could incorporate CPU load when deciding whether to grab a file: if the machine is over X% CPU, wait Y seconds before deciding again.

    Benefits of this are: the server is already written, and the locking is already written. Storing the file sizes means you ask a file for its size only once, and summing the folders becomes easy (SELECT ProcessID, SUM(file_size) FROM files WHERE ProcessID <> 0 GROUP BY ProcessID). It also means you could add or remove servers at will without changing any code at all.

    I hope that gives you some insight into another way to look at the problem, even if you stick with your current approach.


    ___________
    Eric Hodges
      I like this idea. Does anyone know, off hand, how I can get MS SQL Server to look at current activity in other MS SQL Server databases on a local network?

        You probably only want one database server. The work done by the processes is really what you want to distribute. So you would have one machine that accepts incoming files and logs them into the database... then multiple processes (maybe even one ON the db machine) that check the database and do the work.


        ___________
        Eric Hodges
Re: distributed computing
by aplonis (Pilgrim) on Aug 30, 2005 at 18:52 UTC

    Sounds like a job for XML-RPC and the CPAN Frontier module. At least so that the PCs will talk to one another. One PC could run the XML-RPC client and thereby manage any number of other PCs running the XML-RPC server.

    There is an O'Reilly book on the topic. If you did it all in pure Perl, it would not even matter which of the PCs were Win32, which UNIX, etc.

    I have a somewhat overblown example of the XML-RPC way of doing things recently listed in CUFP. Mine is encrypted and does different things than you want. But you could do something similar by following the book. Or you can steal from mine and just switch out the method calls, as appropriate. I shouldn't think it would be hard. My code is very heavily commented.

Re: distributed computing
by InfiniteSilence (Curate) on Aug 30, 2005 at 18:17 UTC
    How can you have code for F without E?

    At any rate, here is an article on building a load-balancing daemon in Perl (I think the code is for Perl 4, but you should be able to hack it up to snuff). My guess is there is already a ton of better stuff out there, but this was the result of a 12-second Google search (e.g. PERL +DAEMON +(load balancing)), which you may have tried before the long post ;)

    BTW: Here's the author's updated link (add 2 seconds to my previous Google search time).

    Celebrate Intellectual Diversity

Re: distributed computing
by Roger (Parson) on Aug 31, 2005 at 17:46 UTC
    Welcome to the world of super computing! Your computers are linked in a NOW (network of workstations), I presume. Cluster computing with Perl is very easy. Let me explain...

    The ideal configuration for a beowulf cluster is to have a head node (a node is a computer) that has access to the external network, and a private network to link up the rest of the nodes with the head node. The head node will have two network cards, one for the internal cluster, one for communicating with the rest of the world.
    C--+--C
       |
    C--+--C
       |
    C--Hub--C(head)
             |
    ---------+--------- Backbone LAN
    This is the ideal world; I doubt that you want to configure your cluster in this manner.

    The easier way is to connect all the computers via the backbone LAN into a NOW (network of workstations); no special hardware is required.
    C   C   C   C(head)
    |   |   |   |
    |   |   |   |
    +---+---+---+------ Backbone LAN
    You will need to prepare your systems as follows:

    1. have a shared storage area such as NFS mount.
    2. have all the computers mount to the same NFS share.

    The next step is to install a parallel virtual machine. Check out the PVM (Parallel Virtual Machine) project.
    You may want to set up ssh on all the machines to make PVM run more smoothly and more securely.

    Go to CPAN and fetch the module Parallel::Pvm, the Perl interface to the parallel virtual machine.

    PVM has pretty much everything you want for parallel computing, except for distributed shared memory (which you really don't need anyway). Your code can be written in manager-worker mode. The manager launches workers, and the launching process is transparent to your program: PVM decides which machine each worker will run on, depending on the workload.
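    To show the shape of manager-worker mode without assuming PVM is installed, here is a single-machine stand-in in core Perl; under PVM, spawn/send/receive calls would replace the fork and the pipes, and the workers would land on different nodes:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Manager-worker sketch: the manager forks one worker per task and
# collects each worker's answer through a pipe. The doubling is a
# placeholder for real per-request work.
sub run_workers {
    my (@tasks) = @_;
    my @results;
    for my $task (@tasks) {
        pipe my $r, my $w or die "pipe: $!";
        my $pid = fork;
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {           # worker: do the task, report back
            close $r;
            print {$w} $task * 2;
            close $w;
            exit 0;
        }
        close $w;                  # manager: read the worker's answer
        my $out = <$r> // '';
        push @results, $out;
        waitpid $pid, 0;
    }
    return @results;
}
```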

    By the way, the machines do not have to be of the same architecture or running the same OS. It's perfectly ok to run a cluster of linux/solaris/windows mixed nodes.

    I am not going to say more here, I hope that you find this information useful, and take joy in doing research into PVM.

      I've been thinking about this problem again and I think that I am making it much more complex than it needs to be.

      Each PC that is available to perform an operation on a flat file could dip at intervals into a single folder (on the network) and remove the longest-present flat file (request file) to the PC in question, preventing other PCs from using it. Grabbing only takes place when no other flat file is being processed on that machine, and the file taken is the one that has been in the holding folder the longest. The interval at which each PC attempts to grab a file could be determined by CPU activity, which would in effect set the probability of each machine taking on the task. The intervals at which the holding folder is checked could be set so that it is impossible for two PCs to attempt to take a file at the same time, although I have no idea how I would work that out. Can anyone help me with that? Two or more PCs taking a file at the same time is the only problem I can see with this approach.
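      One way to sidestep the simultaneous-grab race: on the same volume, rename() is atomic on most filesystems, so a PC can "claim" a request file by renaming it to a PC-specific name before touching it; if two PCs race, exactly one rename succeeds and the loser just moves on to the next file. A minimal sketch (the naming scheme is only an illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Claim a request file by atomically renaming it. Returns the new
# (claimed) path on success, or undef if another PC got there first.
sub claim_file {
    my ($file, $pc_id) = @_;
    my $claimed = "$file.claimed-by-$pc_id";
    return rename($file, $claimed) ? $claimed : undef;
}
```

      Note this only holds when the holding folder and the claimed name are on the same volume; a rename across volumes degrades to copy-and-delete and loses its atomicity.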
Re: distributed computing
by samizdat (Vicar) on Aug 30, 2005 at 18:20 UTC
    I don't know whether this will work with Windows, which it appears you are running, but I would suggest starting with a solution where one machine acts as a dispatcher and uses Expect to tell the others what to do. I suspect you can make Expect work on Win2K, but not on the rest of them. Once you get proficient with Expect, you can try a few more tricks.