Re: The State of Parallel Computing in perl 2007?
by diotalevi (Canon) on Jan 21, 2007 at 18:28 UTC
If your interest in Mozart/Oz is due to its distributed and parallel computing aspects, you might also find Erlang interesting if you haven't already encountered it.
I find the Erlang command-line REPL preferable to the Oz Emacs-based interface, but if you like Emacs that will be less of a consideration.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Tying the language to an editor has almost completely ruled it out for me. What an odd thing to do. It does still sound pretty interesting, though. I'm installing Erlang presently; I've heard people mention it before.
I feel like I'm taking away a "perl doesn't really have much of this yet" feeling from the two posts above this, though. Is that the case?
Re: The State of Parallel Computing in perl 2007?
by zentara (Cardinal) on Jan 21, 2007 at 17:04 UTC
Are you talking about multiple processors working on a single problem, or just processes running in parallel, sharing data somehow? It's a big topic, and you need to narrow down what "parallel" means.
Re: The State of Parallel Computing in perl 2007?
by diotalevi (Canon) on Jan 21, 2007 at 23:33 UTC
Re: The State of Parallel Computing in perl 2007?
by toma (Vicar) on Jan 22, 2007 at 07:42 UTC
I don't know if you would consider it parallel programming or not, but I use memcached to get more than one computer into the act. There are several Perl modules that use it.
Unlike many things in parallel programming, memcached is easy.
POE::Wheel::Run will allow you to use multiple processes, which should provide parallelism on a multiprocessor machine. To use it on Windows, I used Cygwin, since POE::Wheel::Run required the more full-featured fork/exec.
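To make that concrete, here is a minimal sketch (not from the original post): it spawns four placeholder children, each of which would do one slice of a real computation as its own OS process.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use POE qw(Wheel::Run);

    POE::Session->create(
        inline_states => {
            _start => sub {
                # Spawn a few children; each is a separate OS process,
                # so a multiprocessor machine can run them in parallel.
                for my $n (1 .. 4) {
                    my $wheel = POE::Wheel::Run->new(
                        Program     => sub { print "child $n done in pid $$\n" },
                        StdoutEvent => 'got_stdout',
                        CloseEvent  => 'got_close',
                    );
                    $_[HEAP]{wheels}{ $wheel->ID } = $wheel;
                    $_[KERNEL]->sig_child( $wheel->PID, 'got_sigchld' );
                }
            },
            got_stdout => sub {
                my ( $line, $wheel_id ) = @_[ ARG0, ARG1 ];
                print "parent saw (wheel $wheel_id): $line\n";
            },
            got_close   => sub { delete $_[HEAP]{wheels}{ $_[ARG0] } },
            got_sigchld => sub { },    # reap children so none linger as zombies
        },
    );

    POE::Kernel->run();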
Another easy way to do parallel computing is to use a web server. One program can make requests to multiple servers, or to a single server with multiple CPUs, and get them all working on parts of a problem. You can use POE to control the flow of making multiple web requests and synchronizing when they return.
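A rough sketch of the requesting side, assuming POE::Component::Client::HTTP; the worker URLs are made up:

    use strict;
    use warnings;
    use POE qw(Component::Client::HTTP);
    use HTTP::Request;

    POE::Component::Client::HTTP->spawn( Alias => 'ua' );

    POE::Session->create(
        inline_states => {
            _start => sub {
                # Fan one request out to each worker (hypothetical URLs).
                for my $part ( 1 .. 4 ) {
                    my $req = HTTP::Request->new(
                        GET => "http://worker$part.example.com/compute?part=$part" );
                    $_[KERNEL]->post( ua => request => 'got_response', $req );
                }
            },
            got_response => sub {
                my ( $request_packet, $response_packet ) = @_[ ARG0, ARG1 ];
                my $response = $response_packet->[0];
                # Each worker's partial result comes back here for collection.
                print $response->content, "\n";
            },
        },
    );

    POE::Kernel->run();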
It should work perfectly the first time! - toma
I saw in the POE docs that there were references to needing a serializer. I didn't get far enough with it to see that you can fork. Does it support load balancing and things? I've got the POE::Wheel::Run docs up presently, and I'm seeing that it forks a child process, but I thought it was for things like spawning 'cat' or 'ls' or whatever...
That probably isn't what I have in mind, but I do still wonder if POE has some kind of built-in shared-memory multi-processor and/or multi-computer features. It seems like it should.
You use memcached as a database? Not a good idea. Use a database for that. Memcached doesn't consider it a problem to drop your data silently if it runs out of free RAM. That's typical for a cache.
No, I don't think I mentioned using memcached as a database. Memcached allows me to use multiple machines to cache data in a transparent manner. It allows me to use more, cheaper machines rather than one big expensive machine. I have heard people describe this tactic as "build out, not up."
A common use case is to cache the results of an SQL query. I use the query as the cache key, and the value is the result set from the query. I check the cache to see if the dataset is there. If it is, I get it from the cache. If it isn't, I run the SQL and put the results in the cache. This provides me with a huge speedup.
If the data in the database gets updated, I flush the cache and start again. This is not a problem for some parts of my application, so those are the parts where I use memcached. Instead of storing a lot of data in a Perl data structure in a mod_perl program and counting on the copy-on-write mechanism to save RAM, I use memcached.
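A minimal sketch of that cache-aside pattern with Cache::Memcached and DBI; the server addresses, DSN, and credentials are placeholders:

    use strict;
    use warnings;
    use Cache::Memcached;
    use DBI;
    use Digest::MD5 qw(md5_hex);

    my $dbh  = DBI->connect( 'dbi:mysql:mydb', 'user', 'password' );
    my $memd = Cache::Memcached->new(
        { servers => [ '10.0.0.15:11211', '10.0.0.16:11211' ] } );

    sub cached_query {
        my ($sql) = @_;

        # Hash the query: memcached keys must be short and whitespace-free.
        my $key  = md5_hex($sql);
        my $rows = $memd->get($key);
        return $rows if $rows;    # cache hit: skip the database entirely

        $rows = $dbh->selectall_arrayref($sql);    # cache miss: run the SQL
        $memd->set( $key, $rows, 300 );            # keep it for five minutes
        return $rows;
    }

    # When the database gets updated, flush and start again:
    # $memd->flush_all;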
It should work perfectly the first time! - toma
Re: The State of Parallel Computing in perl 2007?
by moklevat (Priest) on Jan 22, 2007 at 15:35 UTC
Hi jettero,
I occasionally work on "trivially" or "embarrassingly" parallelizable problems. For me this most often involves computing a single statistic from a dataset over a large combination of parameters. The most efficient solution for me has been to use R with the Rmpi package to interface with MPI. Depending on the scope of the task, I may also use MySQL for distributing the dataset and collecting the results using RMySQL. I could see doing the same thing in Perl with PDL, Parallel::MPI, and your favorite database.
I clicked through to ::MPI a little, but the low version number and a last update from 1999 kinda scare me off. I have looked at PDL enough to wish I had columns of numbers to process.
I initially had to choose between PVM and MPI, and I ended up using MPI only because that was the first thing I tried and it happened to work for me. From what I had read at the time, PVM should work just as well as MPI for trivially parallelizable tasks. I would not guess that the MPI module is so trivial that it did not warrant any changes, but it does look like the PVM module has seen more development activity.
Re: The State of Parallel Computing in perl 2007?
by markatwork (Initiate) on Jan 22, 2007 at 12:07 UTC
From the realms of "I saw this once and I thought it looked interesting", rather than being anything I've actually used: WSRF::Lite, at http://www.sve.man.ac.uk/Research/AtoZ/ILCT, looks interesting.
Googling for 'perl grid' brings back a few links that seem to concern the areas you're looking at.
Regards
Mark
Re: The State of Parallel Computing in perl 2007?
by erix (Prior) on Feb 18, 2007 at 17:11 UTC
For a multi-computer scenario, Condor might be interesting for you.
Condor lets you submit a program/batchfile/shellscript to a queue served by many machines (nodes). Each of these nodes needs to have a Condor client installed. The client advertises the resources that its particular machine has on offer, and this information is used to match your job requirements to any number of machines. Advertised attributes are things like CPU type, OS type, amount of memory, free disk space, and so on. If enough clients are available, your jobs will run simultaneously.
Condor can use dedicated machines, or take advantage of idle clients: running only at designated times (at night, for instance), or monitoring machine activity and kicking in after some idle period.
Obviously, because clients need to be installed on all machines, it takes some organisation (= politics) to get authorization to run your programs on a sizable group of machines.
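For flavor, a sketch of a Condor submit description file (not from the original post; the file names are placeholders) that queues twenty copies of a hypothetical worker.pl:

    # worker.sub -- submit twenty independent copies of one program
    universe    = vanilla
    executable  = worker.pl
    arguments   = $(Process)
    output      = out.$(Process)
    error       = err.$(Process)
    log         = worker.log
    queue 20

Submitting it with condor_submit worker.sub gives each job its own index in $(Process), so each copy can work on a different slice of the problem.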
Re: The State of Parallel Computing in perl 2007?
by casiano (Pilgrim) on May 22, 2008 at 12:35 UTC
If you have several UNIX platforms with Perl installed and SSH access, then you can use GRID::Machine to run Perl interpreters on those nodes and make them collaborate. The best part is that you don't have to ask administrators to install any additional software.
I have written a tutorial (GRID::Machine::perlparintro) that introduces, through a simple example, how to use Perl via GRID::Machine to exploit the computing power of idle workstations.
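A minimal sketch of the idea, assuming GRID::Machine's eval interface and a placeholder host name:

    use strict;
    use warnings;
    use GRID::Machine;

    # Starts a remote Perl interpreter over SSH; no extra software needed
    # on the remote side beyond Perl and an SSH account.
    my $machine = GRID::Machine->new( host => 'user@remote.example.com' );

    # Evaluate code in the remote interpreter and fetch the result back.
    my $r = $machine->eval(q{
        my $sum = 0;
        $sum += $_ for 1 .. 1_000_000;
        $sum;
    });
    die $r->errmsg unless $r->ok;
    print "remote sum: ", $r->result, "\n";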
Hope it helps,
Casiano