Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Hello hello,
I have a Perl script that I need to run a large number of times on different input data. I use GRID::Cluster to send all the processes to our compute cluster.
All the processes are supposed to write to the same output files, of which there are four. However, in my first attempts the data was jumbled. I would get things like:
line1 data data data line2 data data dataline3 data data data line4 data data data line5 data data data line6 dataline7 data data data data line8 data data data data
I took this to mean that sometimes one process didn't quite finish printing when another process tried writing to the file, so I figured I was going to have to lock the output file before writing to it.
I read several tutorials, including File Locking, but it still doesn't work. Below are simplified versions of my scripts. Can someone help me?
use strict;
use warnings;
use GRID::Cluster;

my $script = "script.plx";
my @processes = (...);
my @machines = (...);
my %max_num_processes = (...);

my $cluster = GRID::Cluster->new(
    host_names => \@machines,
    max_num_np => \%max_num_processes,
);
$cluster->qx(@processes);

use strict;
use warnings;
use Fcntl qw(:flock);

my @letters = qw(a b c d);
foreach my $letter (@letters) {
    my $output_file = "output_" . $letter . ".txt";

    # This didn't solve my problem
    # I also tried +< instead of >>, but nothing happened at all
    # when I did that
    # open my $filehandle, ">>$output_file";
    # flock($filehandle, LOCK_EX);

    # So instead I tried the version in the comments on the
    # tutorial page although I don't really understand it :-(
    open my $semaphore, ">$output_file.lock";
    flock($semaphore, LOCK_EX);
    open my $filehandle, ">>$output_file";
    print $filehandle "line$number data data data\t";
    close $filehandle;
    close $semaphore;
}
Replies are listed 'Best First'.
Re: Threads and output files (locking)
by 7stud (Deacon) on May 14, 2011 at 18:30 UTC
# So instead I tried the version in the comments on the
# tutorial page although I don't really understand it :-(

A semaphore is like a rock sitting on the ground somewhere. Unless the process grabs the rock, it is not supposed to access the file. All the processes agree ahead of time that they won't try to access the file unless they have the rock. However, there is nothing preventing a rogue process from accessing the file directly -- the processes must take it upon themselves to only access the file if they grab the rock first. When one process is done processing the file, it sets the rock down on the ground, and then the next process grabs the rock.

So a semaphore has nothing to do with locking a file per se; the semaphore is the rock, and the processes simply agree that they have to grab the rock before accessing the file.

I don't understand what code your GRID::Cluster processes are executing, so I can't help you there. It would probably behoove you to put aside the GRID::Cluster code and see if you can get locks to work on a simple program that spins off processes (or threads), which then sleep for a random amount of time before trying to write to a file. (Maybe that is what the second bit of code you posted is trying to do?)

See if you are making any of the mistakes described here: http://perl.plover.com/yak/flock/
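The simple test program suggested above might look something like the following. This is only a minimal sketch, not the original poster's code: the scratch directory, the file names, the worker and line counts, and the helper locked_append are all made up for illustration. It forks a few processes that sleep for a random time and then append a line while holding an exclusive lock on a separate semaphore file. It runs on a single machine and a local filesystem, so it says nothing about how flock behaves over NFS.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);
use Time::HiRes qw(sleep);

# Hypothetical names and sizes, chosen only for this sketch.
my $dir         = tempdir(CLEANUP => 1);
my $output_file = "$dir/output_a.txt";
my $lock_file   = "$output_file.lock";
my $workers     = 8;
my $lines_each  = 50;

# Append one line while holding an exclusive lock on the semaphore file.
sub locked_append {
    my ($line) = @_;
    open my $semaphore, '>>', $lock_file or die "Cannot open $lock_file: $!";
    flock($semaphore, LOCK_EX)           or die "Cannot lock $lock_file: $!";
    open my $out, '>>', $output_file     or die "Cannot open $output_file: $!";
    print {$out} $line;
    # Closing the output file flushes the line before the lock is released.
    close $out       or die "Cannot close $output_file: $!";
    # Closing the semaphore file releases the lock.
    close $semaphore or die "Cannot close $lock_file: $!";
    return;
}

my @pids;
for my $worker (1 .. $workers) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        srand($$);    # give each child its own random sequence
        for my $n (1 .. $lines_each) {
            sleep(rand(0.01));    # random pause before each write
            locked_append("worker$worker line$n data data data\n");
        }
        exit 0;
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;

# Check the result: every line should be whole and on its own row.
open my $in, '<', $output_file or die "Cannot open $output_file: $!";
my @lines = <$in>;
close $in;
my @bad = grep { !/^worker\d+ line\d+ data data data$/ } @lines;
printf "%d lines written, %d malformed\n", scalar @lines, scalar @bad;
```

If a version like this reports malformed lines only once the files live on the shared home directory, that would point at the filesystem rather than at the locking logic.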
Re: Threads and output files (locking)
by zentara (Cardinal) on May 14, 2011 at 22:47 UTC
I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh
Re: Threads and output files (locking)
by BrowserUk (Patriarch) on May 15, 2011 at 13:37 UTC
My suggestion would be to start an extra process that opens four sockets or named pipes, and have the other processes write to those.

P.S. You might have received more responses had you not mentioned threads in your title when you are not using threading.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
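One way to read the suggestion above is a single collector process that is the only one allowed to touch the output files, with every worker sending its lines to it. The sketch below is an assumption-laden illustration, not BrowserUk's code: it uses one TCP socket on localhost with an OS-assigned port instead of four separate sockets, and the "letter, tab, payload" line format, the worker count, and the scratch directory are invented for the demo. Sockets are used rather than named pipes because a named pipe only connects processes on the same machine.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;
use File::Temp qw(tempdir);

# Hypothetical setup: localhost, an OS-assigned port, a scratch directory.
my $dir     = tempdir(CLEANUP => 1);
my @letters = qw(a b c d);
my $workers = 4;

my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,            # let the OS pick a free port
    Proto     => 'tcp',
    Listen    => 16,
    ReuseAddr => 1,
) or die "Cannot listen: $@";
my $port = $server->sockport;

# Workers: each sends one tagged line per output file, then disconnects.
my @pids;
for my $worker (1 .. $workers) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        close $server;
        my $sock = IO::Socket::INET->new(
            PeerAddr => '127.0.0.1',
            PeerPort => $port,
            Proto    => 'tcp',
        ) or die "Cannot connect: $@";
        print {$sock} "$_\tworker$worker data data data\n" for @letters;
        close $sock;
        exit 0;
    }
    push @pids, $pid;
}

# Collector: the only process that ever writes to the output files.
my %out;
for my $letter (@letters) {
    open $out{$letter}, '>>', "$dir/output_$letter.txt"
        or die "Cannot open output for $letter: $!";
}

my $select   = IO::Select->new($server);
my %buffer;
my $finished = 0;
while ($finished < $workers) {
    for my $fh ($select->can_read) {
        if ($fh == $server) {
            my $client = $server->accept or next;
            $select->add($client);
            $buffer{$client} = '';
            next;
        }
        my $bytes = sysread($fh, $buffer{$fh}, 4096, length $buffer{$fh});
        # Write out every complete line received so far.
        while ($buffer{$fh} =~ s/^([a-d])\t([^\n]*\n)//) {
            my ($letter, $line) = ($1, $2);
            print { $out{$letter} } $line;
        }
        if (!$bytes) {    # end of stream (or read error): worker is done
            $select->remove($fh);
            delete $buffer{$fh};
            close $fh;
            $finished++;
        }
    }
}
close $_ for values %out;
waitpid($_, 0) for @pids;
```

Because only the collector writes, no file locking is needed at all. On a real cluster the workers would have to be told the collector's host and port, which this sketch does not cover.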
Re: Threads and output files (locking)
by Anonymous Monk on May 14, 2011 at 21:19 UTC
Thanks for your answer. That tutorial was one of the ones I had already read.

I admit the code I posted wasn't very good. I tried to simplify my code, but the result didn't run by itself.

Using a semaphore helps, but there is still a lot of garbled output. I think part of the problem may be NFS, as mentioned in the File Locking tutorial. The output file on which I place the lock is in my home directory, which is mounted on all the cluster computers. I tried using File::NFSLock and File::SharedNFSLock, and ran some tests. Sometimes one module does better, sometimes the other, but neither achieves 100%. In the best case, at least 100 out of 22000 lines are still jumbled together.

So I guess I will have to go back to my previous approach, in which I printed the data to a bunch of files containing four lines each, and then postprocessed the whole thing by concatenating them, sorting, and grepping for the right lines to separate the output into the four desired output files.

Oh well. At least I learned something in the process :-)
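The fallback described above, where every process writes its own small file and a later step merges them, could be sketched roughly as follows. The details are assumptions made for illustration: the part-file naming scheme (host name plus process ID), the "letter, tab, payload" line format, the helper write_part, and the scratch directory are hypothetical, and the jobs are simulated with fork on one machine.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);
use Sys::Hostname qw(hostname);

# Hypothetical scratch directory and job count for the demo.
my $dir     = tempdir(CLEANUP => 1);
my @letters = qw(a b c d);
my $workers = 5;

# Step 1: each job writes four tagged lines to a file that only it uses.
sub write_part {
    my ($job) = @_;
    my $part = sprintf '%s/part.%s.%d.txt', $dir, hostname(), $$;
    open my $out, '>', $part or die "Cannot open $part: $!";
    print {$out} "$_\tjob$job data data data\n" for @letters;
    close $out or die "Cannot close $part: $!";
    return;
}

my @pids;
for my $job (1 .. $workers) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) { write_part($job); exit 0 }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;

# Step 2: once all jobs are done, one process reads every part file
# and groups the lines by their leading letter tag.
my %lines_for;
for my $part (bsd_glob("$dir/part.*.txt")) {
    open my $in, '<', $part or die "Cannot open $part: $!";
    while (my $line = <$in>) {
        my ($letter, $payload) = split /\t/, $line, 2;
        push @{ $lines_for{$letter} }, $payload;
    }
    close $in;
}

# Step 3: write the four final output files, sorted.
for my $letter (@letters) {
    my $file = "$dir/output_$letter.txt";
    open my $out, '>', $file or die "Cannot open $file: $!";
    print {$out} sort @{ $lines_for{$letter} || [] };
    close $out or die "Cannot close $file: $!";
}
```

Since no two processes ever write to the same file, nothing depends on locks working over NFS. The cost is the extra merge step and a directory full of small files.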
Re: Threads and output files (locking)
by marinersk (Priest) on May 15, 2011 at 04:51 UTC
I've attached my filelock.pm module here, maybe it will help you. It's worked for me under a variety of Linux systems, NetBSD, and a variety of Windows32 (W2K, WXP, Server 2000, Server 2003).

It operates on the principle that anyone touching the file will first request a lock on it, and will release it when they are done. The lock itself is another file (the filename you request plus a .lck), so you'll need to have sufficient access to be able to create files in the directory where your target file is located.

It does NOT have any robust timeout features; I wrote the module for myself and it worked for me, so I simply trust myself to only use it when I know all access points will respect the rules. My back-of-mind to-do list includes enhancing it to permit timeouts and that sort of thing, but as yet I've not put the design brain to that end.

In any regard, the module works for me in all environments but one. It used to work in that environment, but the hosting provider did some kind of OS switch and since then I have been getting errors about how the lock limit has been exceeded or somesuch. I don't understand why, but it's the only place I have ever had a problem with this module.

Since the hosting provider was of little help, I've dropped the use of filelock.pm on that site. The application in question has two users and extremely low transaction volume, so we both just cross our fingers a lot. I have been contemplating finding another mechanism by which to establish locking without using flock() but haven't really put my fingers to the keyboard on it.

Nonetheless, this worked for me for many years, with this one web hosting provider being the only place it has failed, and only after they switched OS for their servers.

Good luck! And I wouldn't mind hearing if you try using this to find out if it works or not.
by marinersk (Priest) on May 15, 2011 at 05:26 UTC
And here's a little test program you can use to ensure the locking itself is working properly. It's a heavily watered-down variant of the unit test program, but it seems to work.

You can run it in two windows simultaneously and see how the locking interacts. You can also run the second copy with a "2" as its parameter, and it tests the first two files in reverse order. Not sure I remember why that seemed important at the time LOL.

Examples:
$ filelock-sample.pl
$ filelock-sample.pl 2