Clusters, Distributed Computing, and Perl

by enigmae (Pilgrim)
on Oct 13, 2002 at 09:41 UTC ( [id://204877]=perlquestion )

enigmae has asked for the wisdom of the Perl Monks concerning the following question:

Greetings,
I was looking around CPAN and PerlMonks and have not noticed anyone doing a project that uses cluster technology. My goal is to write a series of Perl scripts that I could run on any machine running Perl, and that would provide load balancing, parallel threads, and common management from a control system.
I have noticed some good modules for threads (parallel) and some for distributed computing (POE), but there doesn't seem to be one solid module that performs all of these tasks. I was essentially going to use a parallel algorithm for finding prime numbers, since this has proven to be an essential task for clusters and supercomputers.
Has anyone set up a multi-threaded script that can perform cluster-like functionality on heterogeneous network environments? Does Perl run on current cluster technology (Beowulf or other), and is there a benefit to using Perl and/or Beowulf for this task? What I had in mind was to use Perl to create a powerful application framework that provides a mechanism to easily multi-task and solve problems using several machines, and that is easy to integrate into existing and future projects.
Thanks for your time,
Enigmae

Replies are listed 'Best First'.
Re: Clusters, Distributed Computing, and Perl
by jj808 (Hermit) on Oct 13, 2002 at 14:19 UTC
Re: Clusters, Distributed Computing, and Perl
by rdfield (Priest) on Oct 13, 2002 at 14:02 UTC
    Have you had a look at P5EE? It's in the formative stages and I'm sure Stephen et al would more than welcome you to the fold.

    rdfield

Re: Clusters, Distributed Computing, and Perl
by rcaputo (Chaplain) on Oct 13, 2002 at 22:30 UTC

    It's possible to build parallelism atop POE through fork() (see POE::Wheel::Run). POE doesn't support Perl threads yet because of an outstanding bug in 5.8.0 (see ticket 15654 at http://bugs6.perl.org/).

Re: Clusters, Distributed Computing, and Perl
by true (Pilgrim) on Oct 14, 2002 at 05:48 UTC
    I've been overtasking for years and have learned the power of using a cron brain, a looped perl socket-server (for task dispatcher), and a bunch of child nodes waiting for microtasks. The child nodes are spread across your cluster.

    In my grossly anti-knowledged opinion, it is identical to the fancy message-passing protocols Beowulf clusters rely on. 'Cept you'd have to set your cluster preferences yourself.

    Sockets rock! You can define your own protocols for speed and memory management. My only beef with them is that killing them can get tricky. But only because I'm green.
    hope dis helps,
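
    For readers who want something concrete, here is a minimal sketch of the "socket-server dispatcher plus waiting child nodes" idea, using only core modules. Everything in it is an assumption made for illustration: the READY/TASK/RESULT/DONE line protocol, the sub names, and the stand-in task (squaring a number) are all invented here and are not the poster's actual setup. The workers are forked on localhost so the example is self-contained; on a real cluster they would run on other machines and connect to the dispatcher's address.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;

# Worker: connect, announce readiness, then answer tasks until told DONE.
# The "task" is squaring a number, a stand-in for real work.
sub run_worker {
    my ($port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1', PeerPort => $port, Proto => 'tcp',
    ) or die "worker cannot connect: $@";
    $sock->autoflush(1);
    print {$sock} "READY\n";
    while (defined(my $line = <$sock>)) {
        last if $line =~ /^DONE$/;
        my ($n) = $line =~ /^TASK (\d+)$/ or next;
        print {$sock} "RESULT $n ", $n * $n, "\n";
    }
    close $sock;
}

# Dispatcher: any line from a worker means it is idle, so give it the next
# task, or release it with DONE once the queue is empty. Returns when
# $workers connections have been released.
sub run_dispatcher {
    my ($listener, $workers, @tasks) = @_;
    my %result;
    my $released = 0;
    my $select   = IO::Select->new($listener);
    while ($released < $workers) {
        for my $fh ($select->can_read) {
            if ($fh == $listener) {              # a new worker is connecting
                my $client = $listener->accept or next;
                $client->autoflush(1);
                $select->add($client);
                next;
            }
            my $line = <$fh>;
            if (defined $line && $line =~ /^RESULT (\d+) (\d+)$/) {
                $result{$1} = $2;
            }
            if (defined $line && @tasks) {
                print {$fh} 'TASK ', shift(@tasks), "\n";
                next;
            }
            # Queue is empty, or the worker hung up: release the connection.
            print {$fh} "DONE\n" if defined $line;
            $select->remove($fh);
            close $fh;
            $released++;
        }
    }
    return \%result;
}

# Start a listener on a free local port, fork the workers, run the dispatcher.
sub run_demo {
    my ($workers, @tasks) = @_;
    my $listener = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1', LocalPort => 0,   # 0: let the OS pick a port
        Listen    => 16, Proto => 'tcp', ReuseAddr => 1,
    ) or die "cannot listen: $@";
    my $port = $listener->sockport;
    my @pids;
    for (1 .. $workers) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) { close $listener; run_worker($port); exit 0; }
        push @pids, $pid;
    }
    my $result = run_dispatcher($listener, $workers, @tasks);
    close $listener;
    waitpid $_, 0 for @pids;
    return $result;
}

my $squares = run_demo(3, 1 .. 10);
print "$_ squared is $squares->{$_}\n" for sort { $a <=> $b } keys %$squares;
```

    The sketch makes no attempt at the tricky parts mentioned above: a worker that never connects will leave the dispatcher waiting, and a task handed to a worker that dies is simply lost rather than re-queued.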
      Yeah, Net::IRC and Proc::ProcessTable rock... (Used these for 'stress testing' stuff...) Generated interesting test results to show how (in)flexible some software was...

      ----
      Zak
      Pluralitas non est ponenda sine necessitate - mysql's philosophy
Re: Clusters, Distributed Computing, and Perl
by Anonymous Monk on Oct 13, 2002 at 15:52 UTC
    I am sorry to be a wet blanket, but why are you doing this work? Projects done in a vacuum because their authors think they would be cool to work on generally don't go anywhere useful. Particularly when there are multiple other projects that do the same thing already.

    If you have a real computational problem to solve, then install Mosix, write your logic using fork when you can, and call it a day. If you want failover, write it as a web application, avoid silly things like shared memory, and put a load balancer in front of it. If your problem is embarrassingly parallel, but you need different machines to co-operate, don't bother with an official cluster. Instead communicate through a database, and keep a current status table of jobs that need to be done, have been started, etc.

    Each of these has been called a cluster, they each solve different problems, and all are usable right now in Perl.
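
    As an illustration of the "write your logic using fork" advice, here is a minimal sketch applied to the prime-counting problem from the original question. It uses only core Perl; the sub names and the choice of one pipe per child are assumptions made for this example, not something taken from the thread. Each child counts the primes in one slice of the range and reports its count to the parent.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Trial-division primality test; fine for a demo, not for large numbers.
sub is_prime {
    my ($n) = @_;
    return 0 if $n < 2;
    for (my $d = 2; $d * $d <= $n; $d++) {
        return 0 if $n % $d == 0;
    }
    return 1;
}

# Count primes in [$lo, $hi] using up to $workers forked children.
# Each child handles one contiguous slice and writes its count to a pipe.
sub count_primes_parallel {
    my ($lo, $hi, $workers) = @_;
    my $span  = $hi - $lo + 1;
    my $chunk = int(($span + $workers - 1) / $workers);   # ceiling division
    my @children;

    for my $i (0 .. $workers - 1) {
        my $from = $lo + $i * $chunk;
        last if $from > $hi;
        my $to = $from + $chunk - 1;
        $to = $hi if $to > $hi;

        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {                  # child: count, report, exit
            close $reader;
            my $count = grep { is_prime($_) } $from .. $to;
            print {$writer} "$count\n";
            close $writer;
            exit 0;
        }

        close $writer;                    # parent keeps only the read end
        push @children, { pid => $pid, fh => $reader };
    }

    my $total = 0;
    for my $child (@children) {
        my $line = readline $child->{fh};
        close $child->{fh};
        waitpid $child->{pid}, 0;
        die "worker $child->{pid} returned nothing" unless defined $line;
        chomp $line;
        $total += $line;
    }
    return $total;
}

print count_primes_parallel(1, 10_000, 4), "\n";   # prints 1229
```

    On a single machine this only spreads the work across local CPUs. The point of the advice above is that on a process-migrating system such as Mosix the same forked children can be moved to other nodes without any change to the script.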

      Projects done in a vacuum because their authors think they would be cool to work on generally don't go anywhere useful. Particularly when there are multiple other projects that do the same thing already.

      Just because an endeavour doesn't benefit "Mankind" doesn't mean it doesn't benefit the Man undertaking the endeavour -- And an experiment conducted over and over by thousands of people in the past is still worth conducting one more time by someone who is not one of those thousands of people.

      It's called education.

        The person doing the experiment again usually gets that benefit more efficiently if they start by learning what the people who already did the experiment thought they learned.

        Which is why the education system doesn't first teach you about gravity by giving you two cannonballs and pointing you at a likely tower.

        This is not to say that there aren't good reasons to have people do things again. You acquire useful skills. The knowledge sticks better. It may be entertaining. It gives an appreciation for what the research entails. And the first few times it is necessary to show that the experiment is reproducible and doesn't depend on hidden factors that the researcher didn't see.

        In this case the benefits are rather one-sided. Even if he eventually writes his own clustering system, he will learn more if he first studies what people already know about the topic.

Re: Clusters, Distributed Computing, and Perl
by Abigail-II (Bishop) on Oct 14, 2002 at 11:46 UTC
    I've worked with SunCluster, MC ServiceGuard and Veritas Clusters. All three allow/require scripting - and that can be done in Perl as easily as in any other language. Some of the Veritas software is actually written in Perl. I remember doing a Veritas training and we wanted to do something that wasn't supported (a postshutdown hook I think). I impressed the trainer by hacking the Veritas software and adding the feature we wanted - just 2 lines of Perl.

    Abigail

Re: Clusters, Distributed Computing, and Perl
by bugsbunny (Scribe) on Oct 14, 2002 at 13:10 UTC
    Look at :
    http://openmosix.sourceforge.net/

    <snip>
    Generally, how do I write an openMosix-aware program?
    Write your programs as you normally would. Any processes that you spawn are candidates for migration to another node.
    Can I write openMosix programs in perl?
    Yes. Use the Parallel::ForkManager module, available from CPAN or directly from http://www.cpan.org/authors/id/D/DL/DLUX/Parallel-ForkManager-x.x.x.tar.gz.
    </snip>
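
    To show roughly what that looks like, here is a minimal sketch using Parallel::ForkManager. The parallel_map helper and the summing workload are hypothetical names invented for this example. It also assumes a reasonably recent version of the module, since passing a data reference back through finish() and run_on_finish() is not available in older releases. Each child is an ordinary forked process, which is what the FAQ above describes as a candidate for migration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical helper: run $work->($item) for each item in its own child,
# at most $max_procs children at a time, and collect the results by item.
sub parallel_map {
    my ($max_procs, $work, @items) = @_;
    my %result;
    my $pm = Parallel::ForkManager->new($max_procs);

    # Runs in the parent each time a child is reaped.
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident, $signal, $core_dump, $data) = @_;
        $result{$ident} = $$data if defined $data;
    });

    for my $item (@items) {
        $pm->start($item) and next;   # parent: go on to the next item
        my $value = $work->($item);   # child: do the work
        $pm->finish(0, \$value);      # child: exit, handing the value back
    }
    $pm->wait_all_children;
    return \%result;
}

# Stand-in workload: sum of the integers 1..n.
my $sum_to = sub {
    my ($n) = @_;
    my $sum = 0;
    $sum += $_ for 1 .. $n;
    return $sum;
};

my $sums = parallel_map(4, $sum_to, 10, 100, 1000);
print "sum(1..$_) = $sums->{$_}\n" for sort { $a <=> $b } keys %$sums;
```

    Items are used as identifiers here, so they need to be unique; duplicate items would overwrite each other's results.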
Re: Clusters, Distributed Computing, and Perl
by elwarren (Priest) on Oct 14, 2002 at 14:08 UTC
    I've been thinking about writing something similar to replace a hack I wrote. Right now I have a client that opens a socket, waits for input, then executes it in an eval and feeds the results back down the socket. The first line is a username/password for security. It doesn't spawn, so if it's already running a job it cannot accept another. Like I said, it's an ugly hack, but it did the job at the time.

    I've been wanting to rewrite this as something more robust, but have been waiting for somebody else to start a base that I could extend to fit my need. The closest thing I've seen so far has been Win32::ProcFarm (which I thought had a Unix counterpart, but I can't seem to find it on CPAN this morning...)

    There's my contribution :-) I'll be watching this thread with bated breath.

    Oh, there is also the Net::Distributed module that was posted here, if you'd like to take the anarchist's cluster approach ;^} Update: Link http://www.perlmonks.org/index.pl?node_id=142837
Re: Clusters, Distributed Computing, and Perl
by enigmae (Pilgrim) on Oct 14, 2002 at 16:25 UTC
    Thanks for all your responses!

    It looks like I have some experimenting to do, but this has been very educational. I wasn't aware there were so many ways and methods for distributed and cluster computing. As I learn I am sure this community will have some good suggestions and stories. I hope others out there have found some new facilities through this post.

    My introduction to distributed computing was through Microsoft's DCOM (cough cough), but that was a hassle. The newer .NET Remoting and web services technology corrects some of it but still gets clumsy. I still think CGI is a great technology, and I still use it, but these XML web services and ASP-like web interface technologies keep developing while it isn't clear what problem they are solving. I have heard about Sun's Grid computing, which nobody mentioned; has anyone heard of it or had experience with it? If I remember correctly, it wasn't free but quite expensive.
    Thanks for all your time,
    Enigmae

Node Type: perlquestion [id://204877]
Approved by blakem
Front-paged by Aristotle