http://qs1969.pair.com?node_id=716973

why_bird has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have a Perl script that is (re-)writing a file using Tie::File, then calling a process to run in the background with this file as its input. I do this in a loop, so that I get a different version of the file for each run of the process.

The problem is (I think..) that the file has not always finished writing by the time I call the next process. How do I wait for a file to finish writing before calling my process? I do not think I am suffering from buffering. I have $|=1. Do I maybe need to somehow set this for the individual file instead?

Perhaps I have got completely the wrong end of the stick and I have another problem here.. if so, does anyone have any suggestions?

This code should illustrate my problem:

#!/usr/bin/perl
# file: plet_test.pl
use strict;
use warnings;
use Data::Dumper;

$|=1;

&do_something(@ARGV);
exit 0;

sub do_something {
    open(FILE,'<',$_[0]) || die;
    my $line=<FILE>;
    print "$line\n";
}
#!/usr/bin/perl
# file: call_stuff.pl
use strict;
use warnings;
use Tie::File;

$|=1;

for(my $i=1;$i<=10;$i++){
    my @array=();
    tie(@array,'Tie::File',"my_file.txt");
    $array[1]="this is a line!";
    $array[2]="woah, another line";
    $array[3]="more lines??";
    $array[4]="and so on....";
    $array[0]="this is the first line. it is number $i. haha!";
    untie(@array);
    system('./plet_test.pl my_file.txt &');
}
exit 0;
On my system, this gives the output:
this is the first line. it is number 2. haha!
this is the first line. it is number 4. haha!
this is the first line. it is number 5. haha!
this is the first line. it is number 6. haha!
this is the first line. it is number 7. haha!
this is the first line. it is number 8. haha!
this is the first line. it is number 9. haha!
this is the first line. it is number 10. haha!
this is the first line. it is number 10. haha!
this is the first line. it is number 10. haha!
Which is the same type of problem as I get with my proper script. Hope this demonstrates what I mean.
thanks
why_bird

update: I realise I've asked a similar question before (albeit in a more rambling fashion). I thought untie(@array) would have the same effect as close(FILEHANDLE), but it doesn't seem to: in that earlier case, making sure I'd close()d my file first did the trick, whereas untie-ing here doesn't have the same effect.

update 2: Actually, as Moritz pointed out, the file is being written too early, not too late.. d'oh

........
Those are my principles. If you don't like them I have others.
-- Groucho Marx
.......

Replies are listed 'Best First'.
Re: Waiting for a file to be written
by moritz (Cardinal) on Oct 14, 2008 at 13:31 UTC
    I'm pretty sure that the problem isn't that my_file.txt isn't fully written when you call system, but rather that it's modified by the next iteration of the parent script before the child script reads it fully.

    The solution is either to use locking, or not to spawn the child script in the background.

    (If you want to learn more about locking, Super Search for it; we have had many discussions about it.)
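
    For what it's worth, a minimal sketch of the locking variant (the lock file name and the placement of the calls are my own assumptions, not something from the original scripts): the writer takes an exclusive lock on a lock file while it rewrites my_file.txt, and the reader takes a shared lock while it reads. The parent still has to be sure the child has taken its lock before starting the next iteration, otherwise the same race remains.

    # in call_stuff.pl (writer), hypothetical lock file "my_file.lock"
    use Fcntl qw(:flock);
    open(my $lock, '>', 'my_file.lock') or die "can't open lock file: $!";
    flock($lock, LOCK_EX) or die "can't get exclusive lock: $!";
    # ... rewrite my_file.txt with Tie::File here ...
    flock($lock, LOCK_UN);

    # in plet_test.pl (reader)
    use Fcntl qw(:flock);
    open(my $lock, '<', 'my_file.lock') or die "can't open lock file: $!";
    flock($lock, LOCK_SH) or die "can't get shared lock: $!";
    # ... read my_file.txt here ...
    flock($lock, LOCK_UN);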

      Yes, of course you are right, according to my own output!

      Thanks for the correction, and suggestions :)

      I need to run the child processes in the background (it's a series of long jobs controlled by a queue manager) so I will look into locking. sleep(10) works too, though it's a rough hack...
      ........
      Those are my principles. If you don't like them I have others.
      -- Groucho Marx
      .......
        Trivial 2-party locking mechanism:
        Why not have the child delete the file when it is finished with it?
        The parent can then wait for the file to disappear before creating the next version of it.
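        A minimal sketch of that scheme, using the file name from the example above (the polling interval is just an assumption):

        # in plet_test.pl, once the child has finished reading the file:
        unlink 'my_file.txt' or warn "could not delete my_file.txt: $!";

        # in call_stuff.pl, at the top of each loop iteration, before tie-ing:
        sleep 1 while -e 'my_file.txt';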
Re: Waiting for a file to be written
by jethro (Monsignor) on Oct 14, 2008 at 13:28 UTC

    Sorry to tell you, but you are suffering from buffering ;-). Tie::File has its own buffering scheme (which I found out simply by reading the Tie::File man page). Look for "Deferred Writing". The solution is to use the flush method of Tie::File at appropriate moments, or to turn off the buffering completely with tie @array, 'Tie::File', $file, autodefer => 0;.
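
    A short sketch of both variants, using the array and file name from the example code above:

    # turn autodeferring off when tying:
    tie(@array, 'Tie::File', "my_file.txt", autodefer => 0) or die;

    # or keep the defaults and flush explicitly before spawning the child:
    (tied @array)->flush;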

    UPDATE: moritz has the right answer. You might think about using a different copy of the file for each of the subprocesses, if your file isn't too big.
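
    For example, a rough sketch of the per-process copy idea against the example loop above (the file naming is just illustrative):

    for my $i (1..10) {
        my $file = "my_file_$i.txt";          # a separate file for each child
        tie(my @array, 'Tie::File', $file) or die "cannot tie $file: $!";
        $array[0] = "this is the first line. it is number $i. haha!";
        # ... fill in the remaining lines as before ...
        untie(@array);
        system("./plet_test.pl $file &");
    }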

      Thank you, this was the information I was hoping for, until Moritz pointed out that it was not in fact the information I needed. Damn buffering!!
      ........
      Those are my principles. If you don't like them I have others.
      -- Groucho Marx
      .......