to.b has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

In my Perl script I execute an external program via the system() function. Subsequently I process the output file of this program.

This works perfectly on my local machine but fails on NFS terminals because the output file is not readily available when Perl returns after the system() call due to a delay in writing files on this system. My script dies because it cannot open the output file.

To avoid this problem, my program sleeps while the output file doesn't exist (tested with -e). Unfortunately, this only solves half of the problem, because sometimes the output file exists but its content has not been written yet. In this case my script dies because the file is empty. To solve this as well, my first idea was to check whether the file is still opened by the external program, but I don't know how to test this with Perl.

What's the preferred way to test for open files? Maybe there's another approach to solve the whole problem? I hope somebody can help me.

Thanks in advance

 to.b

P.S: I use Perl 5.6.1 on UNIX.

Re: Waiting for delayed output after system()
by Zaxo (Archbishop) on May 06, 2004 at 14:42 UTC

    You could check file size as well as existence with -e && -s. That may still not be enough if the file is larger than its write buffer. Do you know what size the output file will be? -s returns the size.

    I'm pretty sure lsof won't help over NFS.

    After Compline,
    Zaxo

      First of all thanks for your fast reply.

      Unfortunately the program's output I'm interested in is only available as a file. And this file doesn't have a fixed size.

      lsof and a subsequent grep would help over NFS, but I wanted to keep the code portable. So I decided not to use this possibility.

      I think your idea to check the file size regularly is very good, although not 100% reliable. I will try this.

      Thanks again!

       to.b

Re: Waiting for delayed output after system()
by Joost (Canon) on May 06, 2004 at 14:41 UTC
    If the external program can be persuaded to print to STDOUT instead of a file, you can do:
        # Read the program's output through a pipe instead of a file
        open PRG, "/some/prog|" or die "Cannot fork prog: $!";
        while (<PRG>) {
            # do something with the line in $_
        }
        close PRG;
    Otherwise, you could test the file size every X seconds, see if it changes, and assume the program is done when the file size doesn't change anymore. (This is a heuristic, not a 100% reliable way of doing this.)
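
    A minimal sketch of that size-polling heuristic. The helper name wait_for_stable_size and its parameters (poll interval, number of unchanged polls required, timeout) are made up for illustration and are not from the thread. Like the heuristic itself, it can be fooled by a writer that pauses for longer than the polling window.

```perl
use strict;
use warnings;

# Hypothetical helper: poll $file every $interval seconds and return its
# size once it is non-empty and has stayed the same for $stable_checks
# consecutive polls. Returns undef if $timeout seconds pass first.
sub wait_for_stable_size {
    my ($file, $interval, $stable_checks, $timeout) = @_;
    my $deadline  = time + $timeout;
    my $last_size = 0;
    my $unchanged = 0;

    while (time <= $deadline) {
        my $size = (-s $file) || 0;    # 0 while the file is missing or empty
        if ($size > 0 && $size == $last_size) {
            $unchanged++;
            return $size if $unchanged >= $stable_checks;
        }
        else {
            $unchanged = 0;            # size changed (or still empty): start over
        }
        $last_size = $size;
        sleep $interval;
    }
    return;    # timed out
}
```

    It could be called as wait_for_stable_size($outfile, 2, 3, 120) or die "output never settled"; where $outfile is whatever file the external program writes.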

    Don't know if it's possible to see open files over NFS, does any monk have some insight?

    Joost.

Re: Waiting for delayed output after system()
by ozone (Friar) on May 06, 2004 at 15:11 UTC
    A possible solution is to create a second file after you've closed the first. This second file would contain some specific string (possibly an MD5 checksum of the first). The client can wait for that second file to appear and then when it contains the string, it can open the first file, knowing that it's fully complete.
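
    A rough sketch of the reading side of this idea, assuming the writer (or a wrapper around it) puts the hex MD5 of the finished output on the first line of the second file. The helper names md5_of_file and wait_for_marker are hypothetical. Digest::MD5 is assumed to be installed; it ships with Perl from 5.8 on, so on 5.6.1 it would have to come from CPAN.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hex MD5 digest of a file's contents, or undef if it cannot be opened.
sub md5_of_file {
    my ($path) = @_;
    open my $fh, '<', $path or return;
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Hypothetical helper: wait until $marker exists and its first line equals
# the MD5 of $data. Returns 1 on success, 0 once $timeout seconds are up.
sub wait_for_marker {
    my ($data, $marker, $interval, $timeout) = @_;
    my $deadline = time + $timeout;
    while (1) {
        if (open my $fh, '<', $marker) {
            my $expected = <$fh>;
            close $fh;
            if (defined $expected) {
                chomp $expected;
                my $actual = md5_of_file($data);
                return 1 if defined $actual && $actual eq $expected;
            }
        }
        return 0 if time >= $deadline;
        sleep $interval;
    }
}
```

    Comparing the checksum, rather than only testing that the second file exists, also catches the case where the marker is visible before the data file has fully arrived.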
Re: Waiting for delayed output after system()
by bluto (Curate) on May 06, 2004 at 15:48 UTC
    Unless I'm missing something, it sounds like the child process continues to run after system() returns (e.g. you are using '&' in your system call or the child itself spawns other children that live past it). The reason I say this is because NFS is no different than other filesystems -- the write calls to it must complete before they return to the writing process. So this isn't really an NFS issue and may eventually bite you even without NFS.

    If you have control of the child's code, you should be able to make sure it finishes completely before returning from system() by making sure it isn't doing anything in the background.

    If you don't control the child's code, but know all of the process PIDs, you can wait for them to die off. Otherwise, you'll probably have to pick a method of polling (as the other monks here have mentioned).
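
    One way to wait for known PIDs to die off, as a sketch only. The helper wait_for_pids and its timeout parameter are invented for this example. It assumes the processes run on the same host as the script and that the script is allowed to signal them; kill with signal 0 sends nothing and only reports whether the process can be signalled.

```perl
use strict;
use warnings;

# Hypothetical helper: poll every $interval seconds until none of @pids
# can be signalled any more. Returns 1 when all are gone, 0 on timeout.
sub wait_for_pids {
    my ($interval, $timeout, @pids) = @_;
    my $deadline = time + $timeout;
    # kill(0, $pid) is true while the process exists and is ours to signal
    while (grep { kill(0, $_) } @pids) {
        return 0 if time >= $deadline;
        sleep $interval;
    }
    return 1;
}
```

    Two caveats: a process owned by another user looks "gone" to this check, and a direct child that has exited but has not been reaped with wait or waitpid still counts as alive.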

Re: Waiting for delayed output after system()
by dave_the_m (Monsignor) on May 06, 2004 at 15:18 UTC
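    You could get the external program (or a wrapper for it) to print the file's final size to STDOUT. Then replace system() with backticks, e.g.

        # The remote command prints the expected final size of the output file
        my $size = `rsh remotehost external_program`;
        chomp $size;
        # Wait until the file exists and has reached that size
        sleep 1 while (! -f $file or -s $file < $size);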