in reply to Those fork()ing flock()ers...

Sounds like you could avoid locking completely and just open the file for "append" access in the parent and use syswrite to ensure that each line is appended in a single I/O operation.
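
A rough sketch of that, with strictly illustrative names (results.log, a loop of four children):

    #!/usr/bin/perl -w
    use strict;

    # Parent opens for append; each forked child writes whole lines with
    # syswrite, so every line is appended in a single write() call.
    open(LOG, ">>results.log") or die "open: $!";

    for (1 .. 4) {
        my $pid = fork;
        die "fork: $!" unless defined $pid;
        if (!$pid) {                              # child
            my $line = "child $$ finished\n";
            defined syswrite(LOG, $line) or die "syswrite failed: $!";
            exit 0;
        }
    }
    wait() for 1 .. 4;                            # reap the children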

        - tye (but my friends call me "Tye")

Replies are listed 'Best First'.
Re: (tye)Re: Those fork()ing flock()ers...
by ferrency (Deacon) on Dec 05, 2001 at 02:33 UTC
    This sounds like a good solution, and would probably do what I need.

    So, syswrite() and/or the underlying "write" system call is/are atomic?
    Isn't it possible for syswrite() to write less than all of the data it's told to write? If so, how is this different from using autoflush and writing with a print() call to the filehandle?

    (at some level, aren't you going to run into a size limitation which prevents write() from completing the write atomically due to hardware buffering or something? Or is that sort of like saying, "eventually it'll fail when the earth falls into the sun"?)

    Thanks. This might be just the solution I was looking for, but I don't think I understand syswrite() and the write() system call well enough to know for sure :)

    Alan

    Update:
    Looking in Programming Perl, I think this doesn't totally solve the problem. It gives me control in cases where a full write can't be done, but it doesn't prevent partial writes from happening:
    "You must be prepared to handle the problems that standard I/O normally handles for you, such as partial writes."

    (if it Happens To Work with syswrite, that may only be for the same reason it Happens To Work when I use stdio and no locking...)

      Yes, appending is atomic. From "man 2 write":

      If the O_APPEND flag of the file status flags is set, the file offset will be set to the end of the file prior to each write and no intervening file modification operation will occur between changing the file offset and the write operation.

      A partial syswrite is possible, but for "regular" files it means that writing the rest of the data is going to fail anyway (unless the resource exhaustion that caused the initial partial write is resolved in the interim).
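
      For completeness, a check for a short write looks something like this (the file name is again only an example):

          use strict;

          open(LOG, ">>results.log") or die "open: $!";
          my $line  = "one whole record\n";
          my $wrote = syswrite(LOG, $line);
          die "syswrite failed: $!" unless defined $wrote;
          die "partial write: $wrote of ", length($line), " bytes\n"
              if $wrote != length $line;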

      I was about to update my node with the following alternative when I noticed your reply. Simply reopen the file once in each child and use flock as usual. The reason that flock doesn't work across the fork is that the file descriptors are all duplicates of each other. The documentation I was able to find on flock really sucked at explaining it (as far as I'm concerned, the Linux version was simply incorrect). But if reopening didn't work, then flock would be useless. (:

      (Updated to add "once" above.)
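
      In code, that alternative comes out roughly like this (file name and child count are only examples); because each child does its own open, it gets its own file descriptor rather than a duplicate of the parent's, so flock behaves as expected:

          use strict;
          use Fcntl qw(:flock);

          for (1 .. 4) {
              my $pid = fork;
              die "fork: $!" unless defined $pid;
              next if $pid;                        # parent keeps forking

              # child: open once, getting a fresh (non-duplicated) descriptor
              open(LOG, ">>results.log") or die "open: $!";
              flock(LOG, LOCK_EX)        or die "flock: $!";
              print LOG "child $$ was here\n";
              close LOG;                           # flushes, then drops the lock
              exit 0;
          }
          wait() for 1 .. 4;                       # reap the children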

              - tye (but my friends call me "Tye")
        I wasn't sure what conditions could cause a "partial write." If it's only a disk-full condition, and not a hardware-buffer-full condition or something, then syswrite will probably do what I want in all the cases I care about. Thanks!

        The alternative method you suggest is the solution I'm currently using. It works just fine, but I don't like having to reopen the files all the time. I'm probably just picky :)

        Thanks,
        Alan

        Update: Ah, sorry for the misunderstanding. Someone else suggested the same alternative in another thread (open each file only once, and flock many times, in each child).

        I think the best solution is a combination of that with a redistribution of the problem space across my child processes. Instead of sending each child only one sub-problem and then letting it die, I could avoid one fork per sub-problem by giving each of my N children 1/Nth of the problem space all at once. If I do that, opening the file once per child will make a lot more sense, because each child will be doing a lot more writing. As it is now, each child only writes a few lines, and doesn't always write to every possible file, so opening every file once per child could actually mean more opens than continuing to open each file only when I append a line :)
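
        Roughly what I have in mind, as a sketch (all the names are made up):

            use strict;

            # Split the sub-problems into N slices up front; each child gets
            # one slice and can open its output files once for the whole run.
            my @subproblems = map { "problem $_" } 1 .. 100;   # stand-in work
            my $n_kids      = 4;

            for my $k (0 .. $n_kids - 1) {
                my $pid = fork;
                die "fork: $!" unless defined $pid;
                next if $pid;                                  # parent keeps forking

                # child $k takes every Nth item: roughly 1/Nth of the space
                my @mine = @subproblems[grep { $_ % $n_kids == $k } 0 .. $#subproblems];

                # ... open each output file once here, then work through @mine,
                #     flocking around each append as above ...
                exit 0;
            }
            wait() for 1 .. $n_kids;                           # reap the children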

        Thanks!