Re: Running "regular" code with threaded perl
by Corion (Patriarch) on Jul 10, 2008 at 06:57 UTC
The problem is not with threading. If your program is written to use threads, you won't be able to run it on a version of Perl without threads. If your program is not written to use threads, Perl won't use threads, and any concurrency problems you have come from other sources.
It seems to me that your program is launching multiple external programs but is not properly synchronizing them. Without seeing the relevant code, it's hard to tell where your program goes wrong. I recommend looking at Parallel::ForkManager or at the simple runN by Dominus. Both approaches are discussed in Parallelization of heterogenous (runs itself Fortran executables) code.
If you want or need to roll your own parallelisation, I recommend having one or more "queues" into which you put the jobs. Your master program then launches the subprograms to process the jobs in the queues and hopefully has simple enough logic to determine when a job further down a queue can be started.
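For illustration, a minimal sketch of the Parallel::ForkManager route, assuming the module is installed; the job list and the xspec command line are placeholders, and the "which job can start when" logic is left out:
    use strict;
    use warnings;
    use Parallel::ForkManager;

    my @jobs = ('job1.xcm', 'job2.xcm', 'job3.xcm');   # placeholder queue of job files

    my $pm = Parallel::ForkManager->new(4);            # run at most 4 children at once

    for my $job (@jobs) {
        $pm->start and next;                           # parent: schedule the next job
        system("xspec - $job") == 0
            or warn "xspec failed for $job: $?";
        $pm->finish;                                   # child exits here
    }
    $pm->wait_all_children;                            # block until every job has finished
The module takes care of counting how many children are running and reaping them, which is exactly the bookkeeping that is easy to get wrong by hand.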
"If your program is not written to use threads, Perl won't use threads and all the concurrency problems you have come from other sources."
What does this mean? Indeed, my program "was not written to use threads" in that it was developed and tested on a single-processor machine. Yet the same code fails on a multi-processor machine with threading enabled. The actual code is simple: I make a system call to a routine that outputs a file, then read the resulting file. The perl script is trying to access the file before the called routine is finished making it (if I load the code in a debugger and step through line by line, it runs just fine). How does this translate to "perl not using threads"? Are you suggesting that it's a problem with the OS managing threads?
What Corion means by "was written to use threads" is simply: does your script contain the statement:
use threads;
If not, the threading features of perl will not be used.
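To make the distinction concrete, "written to use threads" means something along these lines (a minimal, made-up example, not the poster's code):
    use strict;
    use warnings;
    use threads;                    # without this line, none of the below is possible

    my $thr = threads->create(sub { return 6 * 7 });   # spawn one thread
    print "answer: ", $thr->join, "\n";                # wait for it and collect the result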
Re: Running "regular" code with threaded perl
by ikegami (Patriarch) on Jul 10, 2008 at 07:22 UTC
Whatever your problem is, threading support has nothing to do with it. It simply lets you say use threads to create threads. It has no effect on how system works. If you don't create threads, the only difference with using a threaded build is a performance penalty.
"for instance, a system() call creates a file, but the code tries to access the file before the system call is done making it"
system won't return until the child exits, and therefore not until the child is done writing. A process that no longer exists can't write to files, so what you say makes no sense unless the child spawned a detached child and that detached process is actually doing the writing.
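As a sanity check (not ikegami's code, just the usual idiom from perldoc -f system), you can confirm that the child really did run and exit cleanly by looking at what system returns and at $?:
    my $cmd = "xspec - ${tmp}.xcm";    # assumes $tmp is set as in the poster's script
    my $rc  = system($cmd);
    if ($rc == -1) {
        die "failed to launch '$cmd': $!";
    }
    elsif ($? & 127) {
        die sprintf("'%s' died from signal %d\n", $cmd, $? & 127);
    }
    elsif ($? >> 8) {
        warn sprintf("'%s' exited with status %d\n", $cmd, $? >> 8);
    }
    # If we reach this point with an exit status of 0, the child has already
    # exited, so any file it wrote itself should be complete.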
I do not "use threads" in my code. Whether it makes sense or not (it certainly doesn't to me), the fact is that this works:
system("xspec - ${tmp}.xcm");
open(DAT, "${tmp}xsfit.dat") || die ("Could not open file!");
whereas this does not:
bjmsys("xspec - ${tmp}.xcm", $v);
open(DAT, "${tmp}xsfit.dat") || die ("Could not open file!");
where:
sub bjmsys {
    my $arg = shift;
    my $v   = shift;
    $v = 0 unless defined $v;
    my $status;
    print "$arg\n" if $v > 1 || $v < 0;
    $status = system("$arg");
    print "return status = $status\n" if $v > 1;
    return $status;
}
Re: Running "regular" code with threaded perl
by zentara (Cardinal) on Jul 10, 2008 at 11:34 UTC
"I have a shiny new desktop with two quad-core processors."
As others have said, you are under a misconception about threads. Just because you have multiple processors doesn't mean that all programs will automatically be run in some sort of shared-CPU manner. For that matter, even if you "use threads" in a Perl script, your kernel may decide to run all threads on one CPU. Multi-threading as you envision it is done in the kernel, and it takes specialized C programs to utilize it to its full potential. Your single- or multi-threaded Perl program is at the mercy of the kernel when it runs.
Google for "linux multi cpu scheduling" and "linux multi cpu scheduling Perl" and you will see what is happening. If you are not an expert programmer (as is the case with most of us), you are jumping into very deep water, and you will conclude that it isn't worth the time learning to override the kernel's design unless you are doing some extremely intensive number crunching on a supercomputer. Is saving a few milliseconds of execution time worth the many hours of learning required to force dual-CPU usage?
On the contrary, I expected my perl scripts to run on my multicore system just as they did on my single-processor box, with the option to add threads to them down the road if I so chose. I was confused when perl seemed to be doing some kind of internal threading automatically, but as it turns out, it was an unrelated problem (see my post above). I'm happy to see that this works the way I would expect it to. Although I don't consider myself a programmer, I have experience writing massively parallel C code for clusters and supercomputers, so the concept of threading is not entirely foreign to me.
Re: Running "regular" code with threaded perl
by cdarke (Prior) on Jul 10, 2008 at 08:20 UTC
OK, let's look at other things that may have changed. The xspec application is an obvious one: does it fire off asynchronous tasks that might still be running after the "main" program completes? I know this is a horrible, ugly hack, but try putting a sleep 4 after the system call, before trying to open the file. This is for testing only; you would not want to leave the sleep there. It may expose a timing issue.
Also consider whether you have the same version of xspec on both machines; they may behave differently.
OK, this was a helpful comment. Apparently, xspec is silently dying, but only sometimes. So, when I stepped through the debugger or called xspec outside of my perl script, it seemed to run fine, but this test with the sleep command shows me that it sometimes randomly dies (I can make the same call over and over again at the command line, and most of the time it works, but sometimes it fails). So, as suggested, my blaming of threaded perl was misplaced. Xspec is a pretty standard and well-tested piece of astrophysical data analysis software, so figuring out why it's dying on my new system will be another (non-perl-related) problem.
I'll bet you anything that the problem you are having with Xspec is what you suspected perl of doing.
If it works most of the time but only sometimes fails, then the most likely explanation is that it has trouble with multiple threads and fast CPUs that can complete their tasks so fast that the original programmer never expected this to be possible, and thus didn't bother to check for it.
One good option would be to find a mailing list for Xspec and ask whether others see the same issues, check whether there is a new release, etc.
For the time being try to do the following in your code:
until ( -e "${tmp}xsfit.dat" ) {
    bjmsys("xspec - ${tmp}.xcm", $v);
}
open(DAT, "${tmp}xsfit.dat") || die ("Could not open file!");
my @kT=<DAT>; close DAT;
This will basically make sure that the file exists before you continue: if the file does not exist, it will simply try to create it again, and again, until the file is there. It is a very crude way of working, but it will certainly do the trick.
As long as Xspec does not take forever before it fails, you should be saved until the Xspec problem is resolved.
Re: Running "regular" code with threaded perl
by pc88mxer (Vicar) on Jul 10, 2008 at 07:08 UTC
Can you give us an example of a script which doesn't work on your new box but used to work on your old one?
Is it possible that your new box is a multicore system, and now things are happening concurrently whereas before they were being performed sequentially? In any case, we'd probably need to see some code in order to tell you what's going on.
Yes, that's what's happening: the instructions aren't being carried out sequentially. It is a multi-core system, but why does this matter? If I run a C program it runs as a single thread, so it seems to have something to do with how perl is running the script.
I don't know if it will be much help, but here's the explicit code fragment that fails:
bjmsys("xspec - ${tmp}.xcm", $v);
open(DAT, "${tmp}xsfit.dat") || die ("Could not open file!");
my @kT=<DAT>;
close DAT;
It dies when it tries to open the file because it hasn't been created yet.
bjmsys("xspec - ${tmp}.xcm", $v);
Now it would be interesting to know what this mysterious sub bjmsys does.
I somehow suspect that it launches an external application in the background, while it really should just launch it and wait for it to finish.
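To make that suspicion concrete (this is only an illustration, not the OP's code, and the command line is made up), compare a backgrounded command with a plain one:
    # Backgrounded: the shell forks the job off and system() returns immediately,
    # so the output file may not exist yet when the next statement runs.
    system("xspec - run.xcm &");

    # Foreground: system() does not return until the command has exited,
    # so anything it wrote directly is complete by the time we get here.
    system("xspec - run.xcm") == 0
        or die "xspec failed: $?";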
If it's a timing issue, you could just keep trying. Something like
bjmsys("xspec - ${tmp}.xcm", $v);
my $tries = 10;
until ( open(DAT, "${tmp}xsfit.dat") || --$tries <= 0 ) { sleep 1; }
die ("Could not open file: $!") if ($tries <= 0);
my @kT=<DAT>; close DAT;
Not quick, but dirty :)
Re: Running "regular" code with threaded perl
by djp (Hermit) on Jul 11, 2008 at 01:38 UTC
Your problem is that you're not checking the return value of bjmsys(). Try:
#bjmsys("xspec - ${tmp}.xcm", $v);
bjmsys("xspec - ${tmp}.xcm", $v) == 0 or die ("bjmsys failed");
open(DAT, "${tmp}xsfit.dat") || die ("Could not open file!");
BTW, for a scientist, your analysis of the problem was decidedly unscientific! :-).
The return value from system is only printed (not acted on) inside the bjmsys subroutine, and only when the second parameter, $v, is greater than 1. Can kingskot confirm that $v is being set to at least 2 so the status is visible, at least for testing?
Re: Running "regular" code with threaded perl
by NolanPL (Novice) on Jul 10, 2008 at 15:14 UTC
As everybody has said, your problem is not threading.
An easy solution is to just sleep your program for as long as it takes for the file to be created.
Just something like
sleep(3);
while the file is being created, where 3 is however long it takes, will fix the problem. Although it's not great coding, it's an easy, simple solution.
Sleep is a bad hack that fails the moment the system runs whatever the program is waiting on more slowly. Please don't encourage sleep. The proper way to do it is to test for some condition.
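For what it's worth, here is a rough sketch of what "test for some condition" could look like in the OP's situation: poll for the output file instead of sleeping blindly, and give up after a deadline. The file name and timeout are placeholders borrowed from the code above, not anything the poster wrote.
    my $file    = "${tmp}xsfit.dat";   # placeholder: whatever file xspec is expected to write
    my $timeout = 30;                  # placeholder deadline in seconds

    my $waited = 0;
    until ( -e $file && -s $file ) {   # condition: file exists and is non-empty
        die "gave up waiting for $file after ${timeout}s" if $waited >= $timeout;
        sleep 1;                       # short poll interval; the loop exits on the condition, not on a guess
        $waited++;
    }
    open(DAT, $file) || die "Could not open $file: $!";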