PerlMonks  

Parallel Processing, Queueing and Scheduling

by submersible_toaster (Chaplain)
on Mar 11, 2003 at 23:31 UTC ( [id://242198]=perlquestion )

submersible_toaster has asked for the wisdom of the Perl Monks concerning the following question:

Mellow Funks,

I hope to draw on your collective wisdom for guidance more so than technical grit; please indulge me. I am a sysadmin/techsupport/handholder with a company that produces 2D and 3D animation/visual FX. As you might imagine, one of the key process goals is to get work done faster. For me this means parallel processing for render jobs (for animators it means cracking the whip and making them stay late). But I digress.

Render Queueing Wishlist:

  • Cross-platform: Linux, IRIX, Win32.
  • Templated: new task types, their behaviour and interface can be modified by a JAPH.
  • Server-client: the p2p idea of smedge is OK, but try finding or watching a specific log/task - no thanks.
  • Implemented in Perl.
I am confident that I could build the execution model (the one I have in mind) fairly quickly in Perl, but as yet my skill in creating Tk interfaces is, in a word, appalling. This is one reason for my interest in templating. I see no reason not to be able to describe a task by:
  • The binary used to run it
  • The source file of interest
  • Directories/paths of interest
  • Ranges that can be parsed into an arglist
  • A lexicon of "progress" watching, to determine from the binary's STDOUT where it is up to
  • A lexicon of "warning" watching, to determine if it is failing
Ideally these things could be packages/classes generated from a template.
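
To make the list above concrete, here is a rough sketch of how such a task description might look as a plain Perl data structure. Everything in it is hypothetical: the binary path, the scene file, the `-s`/`-e` flags and the two regexes are invented for illustration and do not belong to any real renderer. It only shows the shape of the idea: a template hash, a range split into chunks, and the two "lexicons" as compiled patterns.

```perl
use strict;
use warnings;

# Hypothetical task template: every key and value here is made up for
# illustration, not taken from any real renderer.
my %template = (
    binary   => '/usr/local/bin/render',          # assumed path
    source   => '/jobs/shot42/scene.file',        # assumed scene file
    args     => '-s %d -e %d %s',                 # start, end, source
    progress => qr/^Rendered frame (\d+)/,        # "progress" lexicon
    warning  => qr/\b(?:error|fatal|cannot)\b/i,  # "warning" lexicon
);

# Split an inclusive frame range into chunks of at most $size frames.
sub split_range {
    my ($first, $last, $size) = @_;
    my @chunks;
    for (my $s = $first; $s <= $last; $s += $size) {
        my $e = $s + $size - 1;
        $e = $last if $e > $last;
        push @chunks, [ $s, $e ];
    }
    return @chunks;
}

# Build the command line for one chunk from the template.
sub command_for {
    my ($tmpl, $chunk) = @_;
    return join ' ', $tmpl->{binary},
        sprintf($tmpl->{args}, @$chunk, $tmpl->{source});
}

# Classify one line of renderer output using the two lexicons.
# Returns an empty list for lines that match neither.
sub classify {
    my ($tmpl, $line) = @_;
    return (progress => $1)    if $line =~ $tmpl->{progress};
    return (warning  => $line) if $line =~ $tmpl->{warning};
    return ();
}

print command_for(\%template, $_), "\n" for split_range(1, 25, 10);
```

Supporting a new renderer would then mean writing another hash like `%template` rather than waiting on a vendor, which is the hackability argument made above.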

I am keen to learn how other monks have solved this flavor of problem, whether through purchasing software or by rolling your own. I have plied Google and investigated a number of industry-specific solutions like Muster and smedge2, which have their own features and quirks. smedge2 is unattractive until it ports to Linux at least. Muster is a bit too specific and pedantic. I'd love a Perl solution, even more so one that I could develop myself (not necessarily TODAY, but as a long-term goal). Where each of these programs has lost points is that if you want rendererX supported, you ask, and you wait. Yuck - a system of templates would be far more desirable/hackable.


many thanks -
toaster
I can't believe it's not psellchecked

Replies are listed 'Best First'.
•Re: Parallel Processing, Queueing and Scheduling
by merlyn (Sage) on Mar 12, 2003 at 00:21 UTC
Re: Parallel Processing, Queueing and Scheduling
by perrin (Chancellor) on Mar 11, 2003 at 23:44 UTC
Re: Parallel Processing, Queueing and Scheduling
by kschwab (Vicar) on Mar 12, 2003 at 03:12 UTC
    Not really Perl related, but we've had great success with OpenPBS, aka the open source "Portable Batch System".

    • open source
    • provides a tcl ( I know, I know ) interface where you can even write your own scheduler
    • stdout/stderr watching, as you mentioned
    • an open api for reporting status back
    • dependency chain ( only run job b if job a succeeds, etc)
    I am only using it for running sql batch jobs, as I needed the job dependency chains. Would have gotten by with cron otherwise. So, since I'm not using it like you would, I'm not sure if it covers your other needs. I will note that it expects to run scripts, and not binaries ( it reads the script into stdin for later feeding to whatever is on the shebang line ). You would have to create wrapper scripts for your render jobs...

    Here's the blurb from their website:

    The Portable Batch System (PBS) is a flexible batch queueing and workload management system originally developed for NASA. It operates on networked, multi-platform UNIX environments, including heterogeneous clusters of workstations, supercomputers, and massively parallel systems. Development of PBS is provided by Altair Grid Technologies.

(jeffa) Re: Parallel Processing, Queueing and Scheduling
by jeffa (Bishop) on Mar 12, 2003 at 00:17 UTC
    So what is the question? :) Are you wanting to perform real parallel processing in Perl? If that is the case, then may i recommend C instead? You could either opt for shared memory (pthreads) or message passing (MPI). Using Perl 'threads' to provide the end user with the illusion of multi-processing is one thing, but Real™ parallel processing is best done with a tool like C, not Perl (and no, i don't even think Java cuts the mustard either ;)).

    UPDATE: ahhh, after seeing merlyn's recommendation of POE i changed my mind. Definitely give POE a try for this problem.

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: Parallel Processing, Queueing and Scheduling
by zengargoyle (Deacon) on Mar 12, 2003 at 00:44 UTC

    maybe Overkill, and no MS-Win support (but it might work with cygwin type environment) is PBS. it's widely used in I2/Globus/Grid projects. you would still need to do your template part, but PBS can schedule/manage the jobs across available nodes. i don't use it myself, but we have a (ew, 600ish nodes, 2-4 CPUs/node) cluster that's booked for the next year or so managed by the PBS stuff.

    IRIX and Linux are on the supported list (with other *NIX flavors as well).

Re: Parallel Processing, Queueing and Scheduling
by submersible_toaster (Chaplain) on Mar 12, 2003 at 00:25 UTC

    Update: Dammit, I think I've screwed up my terminology again and confused the issue.
    I will try to describe it better.
    A number of machines exist, each running a client that offers that machine's CPU as available to use.
    On seeing an available client, the server determines which task in the queue has first crack at the CPU resource, and sends the client a segment of that task.
    The server is listening constantly to other clients, regarding their progress.
    Users have a separate client tool to drop tasks on the queue and monitor their progress.

    More to the point, tasks are being processed by plain old executables that know nothing of the queue, nor need to - i.e. host A does not care that while it processes parts 1-10, host B is processing parts 11-20 of the same task.
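
The server-side bookkeeping described here can be sketched without any networking at all. The following is a minimal in-memory simulation, not a real server: the task names, frame ranges and host names are invented, and `next_segment_for` / `segment_done` are hypothetical helper names. It shows only the "first task in the queue gets first crack" rule.

```perl
use strict;
use warnings;

# Queue of tasks in priority order; each task holds its unassigned segments.
# Task names and frame ranges are invented for this sketch.
my @queue = (
    { name => 'shotA', segments => [ [ 1, 10 ], [ 11, 20 ] ] },
    { name => 'shotB', segments => [ [ 1, 5 ] ] },
);

my %running;    # client host => { task => ..., segment => [first, last] }

# Called when a client reports it is idle: hand it the next segment of the
# first task that still has work, or nothing if the queue is drained.
sub next_segment_for {
    my ($client) = @_;
    for my $task (@queue) {
        my $seg = shift @{ $task->{segments} } or next;
        $running{$client} = { task => $task->{name}, segment => $seg };
        return $running{$client};
    }
    return;
}

# Called when a client reports that its segment has finished.
sub segment_done {
    my ($client) = @_;
    return delete $running{$client};
}

for my $client (qw(hostA hostB hostC hostD)) {
    my $job = next_segment_for($client);
    if ($job) {
        printf "%s gets %s frames %d-%d\n",
            $client, $job->{task}, @{ $job->{segment} };
    }
    else {
        print "$client: nothing to do\n";
    }
}
```

A real version would sit behind a socket listener and would also need to put a segment back on its task when a client dies or reports a warning, which this sketch leaves out.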


    I apologise again for the confusion, the -- storm has already begun :(

    Update: Firstly I would just like to say thank you to everyone for your ideas, and I will of course be following up and researching many of these suggestions. If you have not been ++'d by me in this thread yet, when the vote-fairy returns it will be so.

    Secondly, I will be taking a clue-by-four to my silly self, as my requirements are going to have to change. Supporting all those platforms is going to create more work to achieve than it will leverage in processing grunt. Powers that be are already making Linux noises RE 3D animation applications, so supporting Win32 AND *n(u|i)x flavors is biting off more than I can chew. Did I point out that the PHBs are not prepared to buy any more hardware yet to aid processing, and are even less inclined to spend money on software to manage the queue - particularly because there isn't much hardware for it to manage (it gets even more circular after that so I'll stop now).


    I am once again reminded why Perlmonks is the first page I open when I arrive at work. Thanks again.
    -toaster.


    I can't believe it's not psellchecked
      Hi,

      I used to work for a fiber optics lab at a University. We had many users vying for control of the computational cluster, and needed a way to divide time among them based on time, project priority, etc, etc.

      I ended up going with Condor. Condor will support most of what you're asking for. It will also run MPI and PVM jobs, so you can integrate parallel-processing jobs into the system. It's not difficult to setup (does require a decent sysadmin), runs on Unix and Windows, and seems to have a decent userbase. I was able to get answers from some of the developers when I emailed.

      Good luck,
      ibanix

      $ echo '$0 & $0 &' > foo; chmod a+x foo; foo;
      If the jobs are significant pieces of time, the trick that I have used for this is to store information about what jobs are needed in a database. Then let each machine open up a database connection, open a transaction, figure out which job to do, and then mark it as started. It should issue regular updates if desired. Then when it finishes it marks the job as done.

      Users have a tool that allows them to add jobs to the database.

      I didn't develop this into anything complex, but it wasn't hard to get to a usable state. And since coordination is handled in a lightly loaded database, this should scale to a very large number of machines. And it can coordinate processes that need cross-platform resources. Including human intervention!
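
For what it's worth, the claim-a-job step described above might look roughly like this with DBI. This is a sketch under assumptions, not the poster's actual code: it assumes DBI and DBD::SQLite are installed, uses an in-memory SQLite database as a stand-in for the shared server, and the `jobs` table, its columns and the `claim_job` / `finish_job` helpers are all invented names.

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite stands in for the shared database (an assumption; a real
# setup would point every machine at one database server).
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

$dbh->do(q{CREATE TABLE jobs (
    id INTEGER PRIMARY KEY, command TEXT,
    status TEXT DEFAULT 'waiting', host TEXT)});
$dbh->do(q{INSERT INTO jobs (command) VALUES (?)}, undef, $_)
    for 'render -s 1 -e 10', 'render -s 11 -e 20';

# Claim one waiting job for $host inside a transaction. The UPDATE re-checks
# the status, so if another machine got there first no row changes and we
# report that nothing was claimed.
sub claim_job {
    my ($dbh, $host) = @_;
    $dbh->begin_work;
    my ($id, $command) = $dbh->selectrow_array(
        q{SELECT id, command FROM jobs
          WHERE status = 'waiting' ORDER BY id LIMIT 1});
    my $rows = defined $id
        ? $dbh->do(q{UPDATE jobs SET status = 'started', host = ?
                     WHERE id = ? AND status = 'waiting'}, undef, $host, $id)
        : 0;
    $dbh->commit;
    return $rows == 1 ? ($id, $command) : ();
}

# Record the outcome of a job; $status is 'done' or 'failed'.
sub finish_job {
    my ($dbh, $id, $status) = @_;
    $dbh->do(q{UPDATE jobs SET status = ? WHERE id = ?}, undef, $status, $id);
}

# A worker loop: keep claiming until nothing is left waiting.
while (my ($id, $command) = claim_job($dbh, 'hostA')) {
    print "hostA would run job $id: $command\n";
    finish_job($dbh, $id, 'done');
}
```

The "regular updates" mentioned above would be further UPDATE statements against the same row while the job runs. Ordering by a priority column instead of `id` would be the obvious next step.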

      Other solutions worth considering are standard clustering technologies like http://www.mosix.org/, and various solutions that fall under the name grid computing.

        OpenMosix contains several improvements by Moshe Bar on the original work of Prof. Barak (it started as a fork to continue this open project under the GPL). Try it if you want to use a MOSIX-type Linux kernel patch for clustering :)

      I don't know if this helps as an idea, but on a recent work-related project, I wrote a wrapper that checked a database queue for waiting tasks (in my case, account provisioning items), changed the status to pending, processed the item (either itself or through the use of a helper application), then changed it again to a completed or failed status. I had the luxury of having each wrapper only look for one type of item, though, where you would have to do a query probably based on some priority rating or something.

      Lots of good suggestions already, but good luck, and hope the idea helps.

Re: Parallel Processing, Queueing and Scheduling
by pg (Canon) on Mar 12, 2003 at 03:18 UTC
    This really depends on how heavy the processing would be, and how many parallel tasks would be there.

    Of course, as a FUNCTIONALITY, Perl can provide the parallel processing you want, and there is actually more than one solution, but this does not necessarily make Perl the right choice. Perl is not meant for heavy processing.

    If speed is a must, and high-end parallel processing is a must, go C. Also, Perl modules tend to use more memory, which is a big drawback for parallel processing.

    If you don't want to re-develop your application later, go straight for the best tool, which in this case is C.

    Of course, you would lose all the good stuff Perl can provide, for example regexps, rapid development, etc., but trade-offs are everywhere in the IT world, so that is expected.

Re: Parallel Processing, Queueing and Scheduling
by talexb (Chancellor) on Mar 12, 2003 at 14:50 UTC

    We use Sun's Grid Engine on Red Hat 7.2 and 8. It works very well.

    --t. alex
    Life is short: get busy!
      I'll second this one. Like most offerings from Sun, it's amazingly general, and the docs skimp on a few things, but it works quite well. It also has clients for several different OSes, and source code is available at http://gridengine.sunsource.net/ --Hawson
Re: Parallel Processing, Queueing and Scheduling
by AssFace (Pilgrim) on Mar 12, 2003 at 16:15 UTC
    There are three ways that come to mind: COW (cluster of workstations - like the grid computing that one of the responses on here mentions), Beowulf type clustering, and then MOSIX/OpenMosix type clustering (which is sort of a mix of the two).

    As far as I know, none of those allow for cross platform.

    From what I know of parallel work - which from what I have done is graphics and financial data, but I wouldn't really say I'm an expert - you want to have your processing done in C.
    Which it sounds like you are doing - it sounds as if you want something to handle sending off data to each node on the cluster.

    People already talked of beowulf (the pvm and/or mpi stuff is used on that implementation), and they have talked of grids - but I didn't see Mosix on here.

    When I was starting up learning all of this I was interested in doing a Beowulf cluster because... well, they sound cool. But then I was starting to see a trend where for the things that I wanted to do, it was actually adding a level of complexity that wasn't needed.
    So I dumped the Beowulf idea and went with OpenMosix.

    You can obviously read up more on your own of course, but the general idea is that you have N nodes in a cluster - all running Linux, with the kernel mods that OpenMosix requires.
    Then from there, you have a few options - the one I am more familiar with is having one head node that keeps track of what the others are up to and what loads they are under. You can then put your Perl script on the head node and run it there, and every time you want to farm out the processor-intensive part, you fork off your call to the C program (using the Parallel::ForkManager module) and OpenMosix will pass that off to the node that is the least busy at that moment.
    There is also a variant that has more of the grid idea, where everyone's computers run it and they are basically workstations, but can also compute stuff in the background if they are freed up.
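
The head-node script described above boils down to "fork one child per chunk, but never more than N at once". Here is a minimal sketch of that pattern using only core `fork`, `exec` and `wait`, which is the same loop Parallel::ForkManager wraps for you. The worker count and frame ranges are invented, and the "worker" is just another perl process printing its arguments, standing in for the real C program. It assumes a Unix-like system; migrating the children to other nodes would be OpenMosix's job and nothing in the script does it.

```perl
use strict;
use warnings;

my $max_workers = 3;                                    # assumed limit
my @chunks = map { [ $_, $_ + 9 ] } 1, 11, 21, 31, 41;  # frame ranges
my %pid_to_chunk;    # child pid => chunk it is working on
my @failed;          # chunks whose worker exited non-zero

# Wait for one child and record whether it exited cleanly.
sub reap_one {
    my $pid = wait();
    if ($pid == -1) {    # no children left at all
        %pid_to_chunk = ();
        return;
    }
    my $chunk = delete $pid_to_chunk{$pid};
    push @failed, $chunk if $? != 0;
}

for my $chunk (@chunks) {
    # Block until a slot is free before starting another worker.
    reap_one() while keys(%pid_to_chunk) >= $max_workers;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: replace ourselves with the worker program. Here that is
        # perl printing the range, a placeholder for the C binary.
        exec $^X, '-e', 'print "frames $ARGV[0]-$ARGV[1]\n"', @$chunk
            or die "exec failed: $!";
    }
    $pid_to_chunk{$pid} = $chunk;
}

# Collect the stragglers.
reap_one() while %pid_to_chunk;
print scalar(@failed), " chunk(s) failed\n";
```

Any chunks that end up in `@failed` could simply be pushed back onto the work list for another pass.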

    The basic concept of it is that they have a shared network filesystem (PFS - different than NFS, but similar too <g>). They don't have the shared memory of an SMP system, so the bottleneck tends to reside in the network speed.


    A general example for my work would be that I have my cluster, and I ssh into the head node and run the perl script, and then that spawns off the C program, passing it parameters so that it can do its thing, and then it saves out to the disk - then once the bulk processing is done, in my case the perl collects the data and makes one collective document.
    In your case it would vary depending on what rendering you are doing (if you are doing it where each node processes a pixel, then it is *very* different than if you are doing it so that each node is assigned to render out the movie frames N through N+10 and then later put those frames together into a movie).

    I'm not sure how well I explained all of that - but the shortest answer is to read up on OpenMosix and it is likely that someone out there is doing something very similar to what you are wanting to do.

Node Type: perlquestion [id://242198]
Approved by diotalevi
Front-paged by diotalevi