Re^2: Running Parallel Perl Script on Linux Cluster
by sh1tn (Priest) on Mar 31, 2006 at 18:34 UTC
END { open my $fh, ">", "done$param" or die $! }
instead of
END {
system("touch done$param");
};
?
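For comparison, here is a minimal sketch of sh1tn's pure-Perl variant in context (the `$param` handling and the `'demo'` default are hypothetical, just to make the fragment self-contained). Unlike `system("touch ...")`, it spawns no external process, and a failure to create the marker file is reported through `die`:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical worker fragment: $param would normally come from @ARGV.
my $param = defined $ARGV[0] ? $ARGV[0] : 'demo';

# The END block runs just before the script exits, so the marker file
# only appears once the real work (whatever ran above) has finished.
END {
    open my $fh, '>', "done$param" or die "cannot create done$param: $!";
    close $fh;
}

# ... the real computation would run here ...
print "worker for $param finishing\n";
```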
Re^2: Running Parallel Perl Script on Linux Cluster
by monkfan (Curate) on Apr 01, 2006 at 10:30 UTC
Hi monkey_boy,
Thanks a lot for your answer.
In addition to sh1tn's question below, which you haven't answered: do you mean creating "done$param" as a dummy file? Not sure why you would "touch" the file..
I have some follow-up questions:
Hi, this is monkey_boy (not logged in, as I'm at home).
- The "touch" is to create an empty dummy file, separate from your results file. The reasoning is that your results files are created at the start of the processing, so your master script cannot check for their existence as proof of completion.
- The END block always gets executed just before a script terminates, so it's as good a place as any to "touch" the file.
- @done_list is left for you to code; it's simple Perl: you have a list of jobs somewhere, so convert them with a regex into a list of "done" files.
- In reply to sh1tn's question, his way is probably better, as you'll hopefully get an error on failure (but given my experience with Sun Grid Engine, this is not always the case ;))
Hope this is helpful.
monkey_boy
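To make the @done_list idea concrete, here is a minimal sketch of the master-side check; the job names and the `done` prefix are hypothetical, chosen only to match the worker's END-block marker files:

```perl
use strict;
use warnings;

# Hypothetical list of parameter sets that were submitted to the cluster.
my @jobs = ('setA', 'setB', 'setC');

# Convert each job name into the marker file its END block will create.
my @done_list = map { "done$_" } @jobs;

# The master considers the batch finished only when every marker exists.
my @pending = grep { !-e $_ } @done_list;
if (@pending) {
    print "still waiting for: @pending\n";
}
else {
    print "all jobs finished\n";
}
```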
@done_list is left for you to code, its simple perl, you have a list of jobs somewhere, convert them with a regex into a list of "done" files.
Hi monkey_boy,
Just to double-check: @done_list will contain the list of dummy files created in the END block, right? Or is it a list of the actual output files? Sorry, I'm a bit slow here....
Re^2: Running Parallel Perl Script on Linux Cluster
by salva (Canon) on Apr 01, 2006 at 18:14 UTC
A similar result can be attained by changing the shell wrapper instead of the Perl script that runs on the cluster nodes:
#!/usr/bin/bash
cd ~/some_dir
(perl mycode.pl $1 $2 > ~/some_out_dir/param_set$1.out.tmp \
    && mv ~/some_out_dir/param_set$1.out.tmp ~/some_out_dir/param_set$1.out) \
    || touch ~/some_out_dir/param_set$1.fail
Then, on the master, you will have to poll from time to time to see whether all the result files exist, or whether "fail" files are there, in which case you requeue those jobs.
Though, there should be better ways to synchronize partial jobs over the cluster.
...you will have to poll from time to time to see if all the result files exist or if "fail" files are there, to requeue the jobs.
Hi salva,
Thanks for the answer.
Roughly, I suppose one would use glob to check for the files (fail or not)? But I'm not sure how to 'requeue' the jobs.
Can you give a simple example how would one do the above step as you suggested?
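Here is a sketch of one polling pass over the `.out`/`.fail` files salva's wrapper produces. It is self-contained for illustration, so it simulates the output directory with File::Temp: two finished jobs, one failed, one still running. The parameter sets are hypothetical, and the actual resubmission (e.g. `qsub wrapper.sh $p` for Sun Grid Engine) is left as a comment because it depends on your scheduler:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Simulated output directory: jobs 1 and 2 finished, job 3 failed,
# job 4 is still running. All names here are hypothetical.
my $dir    = tempdir(CLEANUP => 1);
my @params = (1 .. 4);
for my $f ("param_set1.out", "param_set2.out", "param_set3.fail") {
    open my $fh, '>', "$dir/$f" or die $!;
    close $fh;
}

# One polling pass: requeue failed jobs, note what is still pending.
my (@requeued, @pending);
for my $p (@params) {
    if (-e "$dir/param_set$p.fail") {
        # Remove the marker so the requeued job starts clean, then
        # resubmit; in real use: system('qsub', 'wrapper.sh', $p);
        unlink "$dir/param_set$p.fail";
        push @requeued, $p;
        push @pending,  $p;
    }
    elsif (!-e "$dir/param_set$p.out") {
        push @pending, $p;    # not finished yet, keep waiting
    }
}
print "requeued: @requeued\n";   # requeued: 3
print "pending:  @pending\n";    # pending:  3 4
```

A real master script would wrap this pass in a loop, sleeping (say, 60 seconds) between passes until @pending is empty.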
Hi monkfan,
Can you share how the code worked? I'd like to try out a variation on a 32 proc cluster in a classroom environment.
Thanks!
vanallp