Re: The State of Parallel Computing in perl 2007?
by diotalevi (Canon) on Jan 21, 2007 at 18:28 UTC
If your interest in Mozart/Oz is due to its distributed and parallel computing aspects, you might also find Erlang interesting if you haven't already encountered it.
I find the Erlang command-line REPL preferable to the Oz Emacs-based interface, but if you like Emacs that will be less of a consideration.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Tying the language to an editor has almost completely ruled it out for me. What an odd thing to do. It does still sound pretty interesting, though. I'm installing Erlang presently; I've heard people mention it before.
I feel like I'm taking away a "perl doesn't really have much of this yet" feeling from the two posts above this, though. Is that the case?
Re: The State of Parallel Computing in perl 2007?
by zentara (Cardinal) on Jan 21, 2007 at 17:04 UTC
Are you talking about multiple processors working on a single problem, or just processes running in parallel, sharing data somehow? It's a big topic, and you need to narrow down what "parallel" means.
Re: The State of Parallel Computing in perl 2007?
by diotalevi (Canon) on Jan 21, 2007 at 23:33 UTC
Re: The State of Parallel Computing in perl 2007?
by toma (Vicar) on Jan 22, 2007 at 07:42 UTC
I don't know if you would consider it parallel programming or not, but I use memcached to get more than one computer into the act. There are several Perl modules that use it.
Unlike many things in parallel programming, memcached is easy.
POE::Wheel::Run will allow you to use multiple processes, which should provide parallelism on a multiprocessor machine. To use it on Windows, I used Cygwin, since POE::Wheel::Run required the more full-featured fork/exec.
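To make that concrete, here is a minimal sketch (not from the original post): it spawns four placeholder children, each of which would do one slice of a real computation as its own OS process.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use POE qw(Wheel::Run);

    POE::Session->create(
        inline_states => {
            _start => sub {
                # Spawn a few children; each is a separate OS process,
                # so a multiprocessor machine can run them in parallel.
                for my $n (1 .. 4) {
                    my $wheel = POE::Wheel::Run->new(
                        Program     => sub { print "child $n done in pid $$\n" },
                        StdoutEvent => 'got_stdout',
                        CloseEvent  => 'got_close',
                    );
                    $_[HEAP]{wheels}{ $wheel->ID } = $wheel;
                    $_[KERNEL]->sig_child( $wheel->PID, 'got_sigchld' );
                }
            },
            got_stdout => sub {
                my ( $line, $wheel_id ) = @_[ ARG0, ARG1 ];
                print "parent saw (wheel $wheel_id): $line\n";
            },
            got_close   => sub { delete $_[HEAP]{wheels}{ $_[ARG0] } },
            got_sigchld => sub { },    # reap children so none linger as zombies
        },
    );

    POE::Kernel->run();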
Another easy way to do parallel computing is to use a web server. One program can make requests to multiple servers, or to a single server with multiple CPUs, and get them all working on parts of a problem. You can use POE to control the flow of making multiple web requests and synchronizing when they return.
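A rough sketch of the requesting side, assuming POE::Component::Client::HTTP; the worker URLs are made up:

    use strict;
    use warnings;
    use POE qw(Component::Client::HTTP);
    use HTTP::Request;

    POE::Component::Client::HTTP->spawn( Alias => 'ua' );

    POE::Session->create(
        inline_states => {
            _start => sub {
                # Fan one request out to each worker (hypothetical URLs).
                for my $part ( 1 .. 4 ) {
                    my $req = HTTP::Request->new(
                        GET => "http://worker$part.example.com/compute?part=$part" );
                    $_[KERNEL]->post( ua => request => 'got_response', $req );
                }
            },
            got_response => sub {
                my ( $request_packet, $response_packet ) = @_[ ARG0, ARG1 ];
                my $response = $response_packet->[0];
                # Each worker's partial result comes back here for collection.
                print $response->content, "\n";
            },
        },
    );

    POE::Kernel->run();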
It should work perfectly the first time! - toma
I saw in the POE docs that there were references to needing a serializer. I didn't get far enough with it to see that you can fork. Does it support load balancing and things? I've got the POE::Wheel::Run docs up presently, and I'm seeing that it forks a child process, but I thought it was for things like spawning 'cat' or 'ls' or whatever...
That probably isn't what I have in mind, but I do still wonder if POE has some kind of built-in shared-memory multi-processor and/or multi-computer features. It seems like it should.
You use memcached as a database? Not a good idea. Use a database for that. Memcached doesn't consider it a problem to drop your data silently if it runs out of free RAM. That's typical for a cache.
No, I don't think I mentioned using memcached as a database. Memcached allows me to use multiple machines to cache data in a transparent manner. It allows me to use more, cheaper machines rather than one big expensive machine. I have heard people describe this tactic as "build out, not up."
A common use case is to cache the results of an SQL query. I use the query as the cache key, and the value is the result set from the query. I check the cache to see if the dataset is there. If it is, I get it from the cache. If it isn't, I run the SQL and put the results in the cache. This provides me with a huge speedup.
If the data in the database gets updated, I flush the cache and start again. This is not a problem for some parts of my application, so those are the parts where I use memcached. Instead of storing a lot of data in a Perl data structure in a mod_perl program and counting on the copy-on-write mechanism to save RAM, I use memcached.
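A minimal sketch of that cache-aside pattern with Cache::Memcached and DBI; the server addresses, DSN, and credentials are placeholders:

    use strict;
    use warnings;
    use Cache::Memcached;
    use DBI;
    use Digest::MD5 qw(md5_hex);

    my $dbh  = DBI->connect( 'dbi:mysql:mydb', 'user', 'password' );
    my $memd = Cache::Memcached->new(
        { servers => [ '10.0.0.15:11211', '10.0.0.16:11211' ] } );

    sub cached_query {
        my ($sql) = @_;

        # Hash the query: memcached keys must be short and whitespace-free.
        my $key  = md5_hex($sql);
        my $rows = $memd->get($key);
        return $rows if $rows;    # cache hit: skip the database entirely

        $rows = $dbh->selectall_arrayref($sql);    # cache miss: run the SQL
        $memd->set( $key, $rows, 300 );            # keep it for five minutes
        return $rows;
    }

    # When the database gets updated, flush and start again:
    # $memd->flush_all;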
It should work perfectly the first time! - toma
Re: The State of Parallel Computing in perl 2007?
by moklevat (Priest) on Jan 22, 2007 at 15:35 UTC
Hi jettero,
I occasionally work on "trivially" or "embarrassingly" parallelizable problems. For me this most often involves computing a single statistic from a dataset over a large combination of parameters. The most efficient solution for me has been to use R with the Rmpi package to interface with MPI. Depending on the scope of the task, I may also use MySQL for distributing the dataset and collecting the results using RMySQL. I could see doing the same thing in Perl with PDL, Parallel::MPI, and your favorite database.
I clicked through to ::MPI a little, but the low version number and a last update from 1999 kinda scare me off. I have looked at PDL enough to wish I had columns of numbers to process.
I initially had to choose between PVM and MPI, and I ended up using MPI only because that was the first thing I tried and it happened to work for me. From what I had read at the time, PVM should work just as well as MPI for trivially parallelizable tasks. I would not guess that the MPI module is so trivial that it did not warrant any changes, but it does look like the PVM module has seen more development activity.
Re: The State of Parallel Computing in perl 2007?
by markatwork (Initiate) on Jan 22, 2007 at 12:07 UTC
From the realms of "I saw this once and I thought it looked interesting", rather than being anything I've actually used: WSRF::Lite, at http://www.sve.man.ac.uk/Research/AtoZ/ILCT, looks interesting.
Googling for 'perl grid' brings back a few links that seem to concern the areas you're looking at.
Regards
Mark
Re: The State of Parallel Computing in perl 2007?
by erix (Prior) on Feb 18, 2007 at 17:11 UTC
For a multi-computer scenario, Condor might be interesting for you.
Condor lets you submit a program/batchfile/shellscript to a queue served by many machines (nodes). Each of these nodes needs to have a Condor client installed. The client advertises the resources that its particular machine has on offer, and this information is used to match your job requirements to any number of machines. Advertised attributes are things like CPU type, OS type, amount of memory, free disk space, and so on. If enough clients are available, your jobs will run simultaneously.
Condor can use dedicated machines, or take advantage of idle clients: running only at designated times (at night, for instance), or monitoring machine activity and kicking in after some idle period.
Obviously, because clients need to be installed on all machines, it takes some organisation (= politics) to get authorization to run your programs on a sizable group of machines.
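For flavor, a sketch of a Condor submit description file (not from the original post; the file names are placeholders) that queues twenty copies of a hypothetical worker.pl:

    # worker.sub -- submit twenty independent copies of one program
    universe    = vanilla
    executable  = worker.pl
    arguments   = $(Process)
    output      = out.$(Process)
    error       = err.$(Process)
    log         = worker.log
    queue 20

Submitting it with condor_submit worker.sub gives each job its own index in $(Process), so each copy can work on a different slice of the problem.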
Re: The State of Parallel Computing in perl 2007?
by casiano (Pilgrim) on May 22, 2008 at 12:35 UTC
If you have several UNIX platforms with Perl installed and SSH access, then you can use GRID::Machine to run Perl interpreters on those nodes and make them collaborate. The best part is that you don't have to ask administrators to install any additional software.
I have written a tutorial (GRID::Machine::perlparintro) that introduces, through a simple example, how to use Perl via GRID::Machine to exploit the computing power of idle workstations.
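A minimal sketch of the idea, assuming GRID::Machine's eval interface and a placeholder host name:

    use strict;
    use warnings;
    use GRID::Machine;

    # Starts a remote Perl interpreter over SSH; no extra software needed
    # on the remote side beyond Perl and an SSH account.
    my $machine = GRID::Machine->new( host => 'user@remote.example.com' );

    # Evaluate code in the remote interpreter and fetch the result back.
    my $r = $machine->eval(q{
        my $sum = 0;
        $sum += $_ for 1 .. 1_000_000;
        $sum;
    });
    die $r->errmsg unless $r->ok;
    print "remote sum: ", $r->result, "\n";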
Hope it helps,
Casiano