thor has asked for the wisdom of the Perl Monks concerning the following question:

Greetings monks,

I submit the following question for your ponderance. I have written a Perl script for work that exploits the naming conventions of SQL files to determine which server they have to run on, and which database they have to be executed against. I've run into the problem that if a developer has many large files to run, they can take a very long time, especially with endsite servers in Manila, Sydney, Madrid, etc... What I want to do is fork off a child for each server that has to be hit. My problem lies in the fact that the tool also logs its results to either a file or to STDOUT (user's choice). How do I arrange it so that children don't "clobber" each other trying to write to the same filehandle? I have a feeling that this has to do with flock, but I don't know where to begin.

Thanks in advance,
thor

Replies are listed 'Best First'.
Re: Fork novice requests help...
by Abigail (Deacon) on Jun 20, 2001 at 02:20 UTC
    I fail to see the problem with flock. It's easy. However, there might be a way to avoid use of flock.

    If you are using log files, you will typically append to the files, instead of overwriting them each time you log something. Modern Unix kernels (I've no idea how non-Unix kernels deal with the problem) will guarantee that writes done to files opened for append are atomic. That means that if you are printing small strings (typically strings not exceeding 512, 1024 or 4096 bytes, depending on the system) you will be fine - no clobbering.
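    A minimal sketch of that append-mode behaviour (the log path and child count here are invented for illustration):

```perl
# Sketch: forked children each append one short line to a shared log.
# Because the file is opened in append mode and each message is one
# small print, the writes land whole -- no interleaving, no flock.
use strict;
use warnings;

my $log = "/tmp/append_demo.log";      # hypothetical path
unlink $log;

for my $kid (1 .. 4) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                   # child
        open my $out, '>>', $log or die "Failed to open: $!";
        print $out "child $kid done\n";    # one small, atomic write
        close $out or die "Failed to close: $!";
        exit 0;
    }
}
wait for 1 .. 4;                       # reap all four children

open my $in, '<', $log or die "Failed to open: $!";
my @lines = <$in>;
close $in;
print scalar(@lines), " lines logged\n";
```

    All four lines arrive intact regardless of scheduling, because each child issues a single small write to an append-mode handle.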

    However, if that isn't the case, you still need to use flock. But that's really easy. Assume you open the file(s) for append, all you need is:

        use Fcntl ':flock';

        ....

        # Code done in every child.
        open my $fh => ">> /path/to/log/file" or die "Failed to open: $!";
        flock $fh => LOCK_EX or die "Failed to lock: $!";
        print $fh $your_text;
        close $fh or die "Failed to close: $!";
    That's all there is to it! One use statement to include the constant LOCK_EX, and just one extra statement.

    The perlopentut manual should tell you more about flocking. You also might want to look into man perlipc.

    -- Abigail

Re: Fork novice requests help...
by chromatic (Archbishop) on Jun 20, 2001 at 00:34 UTC
    There are lots of solutions. You could open a pipe to each child process (instead of forking explicitly) and pass back information from the child to the parent. You could open a filehandle to a temporary file (there's a CPAN module for good temp files) and use that for each child... when you catch SIGCHLD, open the temporary file and append it to the log. You could tie a filehandle to a scalar (Tie::Handle, I believe) and print to that, though you might have to mark it as shareable. (Who knows how far the magic extends across forks?)
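    The pipe-per-child option can be sketched like this (the server names and messages are made up; `open '-|'` both forks and wires the child's STDOUT to a read handle in the parent):

```perl
# Sketch: one pipe-opened child per server; the parent is the only
# process that ever touches the log, so nothing can clobber anything.
use strict;
use warnings;

my @servers = qw(manila sydney madrid);   # hypothetical server list
my @pipes;

for my $server (@servers) {
    my $pid = open my $fh, '-|';          # fork; child's STDOUT -> $fh
    die "Can't fork: $!" unless defined $pid;
    if ($pid == 0) {                      # child: run scripts, print results
        print "$server: scripts ran OK\n";
        exit 0;
    }
    push @pipes, $fh;                     # parent keeps the read end
}

# Parent drains each pipe in turn and serializes the output itself.
my @log;
for my $fh (@pipes) {
    push @log, $_ while <$fh>;
    close $fh;
}
print @log;
```

    Since only the parent writes the final output, no locking is needed at all; the cost is that each child's results arrive in pipe order rather than completion order.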

    If I were you, I'd go the temporary files route. Getting flocks to work well across forks seems tedious to me.

    Update: As Abigail points out, flocking after the fork is indeed the way to go. I'm backwards on this one! heh
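    For reference, a rough sketch of the temporary-files route (File::Temp is the module alluded to above; the job names are invented):

```perl
# Sketch: each child logs to its own File::Temp file; after reaping,
# the parent folds each private log into the shared output in one place.
use strict;
use warnings;
use File::Temp qw(tempfile);

my %tmp_for;                               # pid => temp file name

for my $job (qw(serverA serverB)) {        # hypothetical jobs
    my ($fh, $name) = tempfile(UNLINK => 0);
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                       # child: log privately
        print $fh "$job: results here\n";
        close $fh;
        exit 0;
    }
    close $fh;                             # parent doesn't need the handle
    $tmp_for{$pid} = $name;
}

# Reap each child, then append its private log to the real one.
my @log;
while ((my $pid = wait) != -1) {
    open my $in, '<', $tmp_for{$pid} or die "Failed to open: $!";
    push @log, <$in>;
    close $in;
    unlink $tmp_for{$pid};
}
print @log;
```

    Note that the merged order follows the order in which children are reaped, not the order in which they were started.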

      I was also thinking that getting flock to work might be a hassle. I have been working on this for a bit and may have come up with a reasonable solution. If I have a temporary variable that is reserved for the purpose of saying "filehandle is in use", then I could fork off at my leisure, and upon their return, check to see if this variable is 0 or 1. If it is 0, set it to 1, write the results, and set it back to 0. If it is 1, sleep for a bit and try again. What does everyone think of that?
        Well, that's easy. I don't think much of that. You would either have to work with threads, or shared memory segments to communicate the state of the variable between the various processes. But, since each of the processes can actually change the variable, you need to implement some form of locking mechanism on the variable.... So, to avoid the "hassle" of locking files (which is really trivial in Unix), you need threads (which don't really work in Perl), or System V shared memory segments (a hassle), and on top of that, you need some form of locking anyway.

        -- Abigail

Re: Fork novice requests help...
by E-Bitch (Pilgrim) on Jun 20, 2001 at 00:13 UTC
    Thor-
    I actually am pondering a similar problem. The best solution that I have come up with is to place all output into a hash as a string (have each entry of output be a new record in the hash), then when all children have finished execution, and control returns to the parent thread, shoot through the hash, and output every entry on a new line. This can be cumbersome if you are recording a lot of records, unfortunately.

    Hope this helps!
    E-Bitch


    <update>Okay, Abigail just schooled me.... I'm humbled. I tried out my 'best solution' and, as Abigail stated, I didn't quite understand threads in Perl... Kudos, Abigail</update>
      Did you try out your "best solution"? Because I fail to see how this is supposed to work. As soon as you fork, your data (that is, your variables) are copied - that is, both the parent and the child each get a copy, and they do not share the variables.

      But you mentioned threads, so, unlike the original poster, you might use threads instead of forks. Then variables can be shared. However, if two threads can access the same variable, you will need locking on the variable - which means to avoid locking, you will need to do locking.

      -- Abigail
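      For completeness, here is how the shared-variable version might look with ithreads (a sketch only; it needs a thread-enabled perl, and note that lock() is doing exactly the locking described above):

```perl
# Sketch: threads share @log explicitly, but every writer must still
# take the lock -- "to avoid locking, you will need to do locking."
use strict;
use warnings;
use threads;
use threads::shared;

my @log :shared;                  # visible to all threads, unlike fork

my @workers = map {
    my $id = $_;
    threads->create(sub {
        lock(@log);               # serialize access to the shared array
        push @log, "thread $id logged\n";
    });
} 1 .. 3;

$_->join for @workers;
print @log;
```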