batcater98 has asked for the wisdom of the Perl Monks concerning the following question:

I have a software package that builds a comma-delimited flat file. I want to write a routine that checks the status of this file to determine when the software is done and no longer writing to it. The routine should check whether the file is open or closed; if it is still open, wait 10 minutes and check again, staying in this loop until the file is closed. Once I know the file is done, I want to create a flag or trigger file so that other jobs can be triggered by its creation. Ideas? Thanks, Ad.

Re: How to check a files Status?
by ikegami (Patriarch) on Apr 23, 2010 at 15:59 UTC

    Does the generator lock the file? You could check to see if the file is locked.

    If you can, have the software package create the file in one directory and rename it into the final directory when it's done. Since rename is atomic (as long as both directories are on the same filesystem), the file will be complete as soon as it appears in the final directory. The background process can even ask the system to notify it when the directory changes, so that it can respond faster than once every 10 minutes.
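
    A minimal sketch of the rename approach, written from the producer's side. The helper name publish_file and the directory layout are hypothetical, and it assumes the staging and final directories live on the same filesystem:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: write $content into a staging directory, then
# rename the finished file into $final_dir. Assumes both directories are
# on the same filesystem, which is what makes rename() atomic.
sub publish_file {
    my ($staging_dir, $final_dir, $name, $content) = @_;

    my $tmp_path   = File::Spec->catfile($staging_dir, $name);
    my $final_path = File::Spec->catfile($final_dir,   $name);

    open my $out, '>', $tmp_path or die "Cannot write $tmp_path: $!";
    print {$out} $content;
    close $out or die "Cannot close $tmp_path: $!";

    # The file only becomes visible in $final_dir once it is complete.
    rename $tmp_path, $final_path
        or die "Cannot rename $tmp_path to $final_path: $!";
    return $final_path;
}
```

    A consumer watching the final directory never sees a half-written file, so it needs no open/closed test at all.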

Re: How to check a files Status?
by AR (Friar) on Apr 23, 2010 at 16:00 UTC

    How do you define "open" and "closed"? You may want to have the software package write a well-defined trailer at the end of the file right before it completes. That way, there is no ambiguity about whether the software is done.
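
    A rough sketch of the trailer check, assuming the generator can be made to end the file with an agreed marker line. The sub name has_trailer and whatever marker string you pass in are made up for illustration:

```perl
use strict;
use warnings;

# Return 1 if the last line of $path equals $trailer, 0 otherwise
# (including when the file is missing or empty). The trailer text is
# hypothetical; the generator would have to be changed to write it.
sub has_trailer {
    my ($path, $trailer) = @_;

    open my $in, '<', $path or return 0;
    my $last;
    while (my $line = <$in>) {
        $last = $line;    # remember only the most recent line
    }
    close $in;

    return 0 unless defined $last;
    chomp $last;
    return $last eq $trailer ? 1 : 0;
}
```

    This reads the whole file to find the last line, which is fine for a sketch; for very large files you would seek near the end instead.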

    If you have no control over the software package itself or the format of the file, one idea is to get some statistic about the file (last line or byte size), sleep, get the same statistic again and compare it to the previous one. Loop until the statistic no longer changes. The problem is that this is not certain to work unless you can guarantee that the software package writes more often than your script wakes up.
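
    Something like the following could implement the size-comparison loop and the trigger file from the question. It is only a sketch: the sub names and paths are invented, and it inherits the caveat above (a generator that pauses longer than the interval will fool it):

```perl
use strict;
use warnings;

# Poll the byte size of $path every $interval seconds and return the size
# once two consecutive checks agree. A missing file counts as "not ready".
sub wait_until_stable {
    my ($path, $interval) = @_;
    my $previous = -1;
    while (1) {
        my $size = (stat $path)[7];          # undef if the file is missing
        $size = -1 unless defined $size;
        return $size if $size >= 0 && $size == $previous;
        $previous = $size;
        sleep $interval;
    }
}

# Create an empty flag file that downstream jobs can look for.
sub touch_trigger {
    my ($trigger) = @_;
    open my $fh, '>', $trigger or die "Cannot create $trigger: $!";
    close $fh or die "Cannot close $trigger: $!";
    return $trigger;
}
```

    For the 10-minute check in the question you would call wait_until_stable($file, 600) and then touch_trigger("$file.done"), where the ".done" suffix is just an example name.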

    Edit: I like ikegami's solution better.

Re: How to check a files Status?
by ig (Vicar) on Apr 24, 2010 at 06:17 UTC

    There are packages that do job scheduling, including handling dependencies between jobs and recovery from failures. You might investigate some of them.

    A fundamental issue is whether you want the various jobs to be aware of each other, or whether you want the dependencies and coordination to be handled externally, as a job scheduler does.

    I would try to avoid polling. It is inefficient and/or introduces latency. If you don't want the package that builds the file to initiate the subsequent jobs when it is done, you might consider having whatever starts that package do so synchronously and run the following jobs when the first one exits.
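
    A small sketch of the synchronous approach: a wrapper starts the generator, waits for it to exit, and only then runs the follow-up jobs. The sub name run_in_sequence is hypothetical, and the commands you pass are whatever your jobs happen to be:

```perl
use strict;
use warnings;

# Run each command in order and stop at the first failure.
# Each command is an array ref: program followed by its arguments.
# system() blocks until the command exits, so no polling is needed.
sub run_in_sequence {
    my @commands = @_;
    for my $cmd (@commands) {
        my $status = system @$cmd;
        return 0 if $status != 0;    # non-zero exit or failure to start
    }
    return 1;
}
```

    Since the generator has exited by the time the second command starts, the file is necessarily closed and no trigger file is required.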

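    To determine whether some other process is still using the file, I would try to open the file and set an exclusive lock on it. Note that this only helps if the generator locks the file itself: on Unix, flock is advisory, so the attempt fails only when another process holds a conflicting lock, not merely because it has the file open. On systems with mandatory locking or restrictive sharing modes, if the other process has an exclusive lock, you might not even be able to open the file.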
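
    A hedged sketch of the lock test using Perl's flock. The sub name is_unlocked is made up, and the whole idea assumes the generator takes a flock-style lock while writing; if it does not, this will always report the file as free:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Try to take an exclusive, non-blocking lock on $path.
# Returns 1 if the lock was obtained (nobody else holds one),
# 0 if another lock holder got in the way. Dies if the file
# cannot be opened at all.
sub is_unlocked {
    my ($path) = @_;

    # Read/write mode, because some flock emulations require write
    # intent for LOCK_EX. The file contents are not modified.
    open my $fh, '+<', $path or die "Cannot open $path: $!";
    my $got = flock($fh, LOCK_EX | LOCK_NB);
    close $fh;    # closing the handle releases the lock again
    return $got ? 1 : 0;
}
```

    LOCK_NB makes flock return immediately instead of waiting, so the caller can sleep and retry on its own schedule.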

    You say "other jobs". Depending on how they interact, you may have to be careful to avoid deadlock, where every job is waiting on some resource held by some other job and nothing can progress. If each job uses (locks for exclusive access) only one resource at a time, this won't be a problem.

Re: How to check a files Status?
by repellent (Priest) on Apr 24, 2010 at 08:59 UTC