http://qs1969.pair.com?node_id=473534

rogue90 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks - long time listener, first time caller. I have a simple script that outputs simple text to a file. Easy enough. When that file nears 50k, I want to close it and start outputting to a new file. stat won't work because the output is still in the buffer and not actually written to the file yet. I can't seem to find a solution other than processing the file again after I close it. Any ideas? Happy Friday :) -r90

Re: limit output filesize
by Transient (Hermit) on Jul 08, 2005 at 18:57 UTC
    I ran this as a test:
        #!/usr/bin/perl
        use warnings;
        use strict;

        my $file_base = "output";
        open( FILE, ">$file_base" )
            or die "Unable to open $file_base for writing!\n$!\n";
        for (1..100) {
            print FILE "0" x 1024;
            print "File Size: ".(stat($file_base))[7]."\n";
            last if (stat($file_base))[7] > 50000;
        }
        close FILE or die "Error closing $file_base\n$!\n";
    the output was as follows (abridged):
        File Size: 0
        File Size: 0
        File Size: 0
        File Size: 0
        File Size: 4096
        File Size: 4096
        File Size: 4096
        File Size: 4096
        ...
        File Size: 49152
        File Size: 49152
        File Size: 49152
        File Size: 53248
    So it stopped close to, but not at, 50k, because while the file is open only full blocks (4096 bytes here) are written to it. You may want to keep that in mind and set the cutoff to the largest multiple of the block size that is below your threshold.

    This is AIX 5.1 on a JFS2 file system. I can't speak for other systems.

    Update:
    As an aside, $|++ did nothing for this.
    You could also keep tabs on your output size yourself, if it is all under your control.
      I see - I was using the filehandle for stat. Thanks much!
        Urm... the filehandle for stat, actually for -s, works for me.
            open OUT, ">test.txt";
            for (1 .. 1000) {
                print OUT "Hello, Perlmonks!\n" x 10;
                print -s OUT;
            }
        It works, but -s reports only what has already been flushed from the buffer, not everything that has been printed so far:
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        4096
        4096
        4096
        4096
        4096
        4096
        4096
        4096
        ...
        

        With $| set to a true value for the handle, the size is tracked at a finer granularity, but output will most likely be quite a bit slower:

            open OUT, ">test.txt";
            my $fh = select OUT;
            $| = 1;
            select $fh;
            for (1 .. 1000) {
                print OUT "Hello, Perlmonks!\n" x 10;
                print -s OUT;
            }
        192
        384
        576
        768
        960
        1152
        1344
        1536
        1728
        1920
        2112
        2304
        ...
        
        I think the coarse version is still fine enough.
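A possible alternative to checking -s, offered only as a sketch: tell() on the output handle reports the logical write position, which includes bytes still sitting in the buffer, so no flushing is needed. This assumes a Unix-like system and a handle with no encoding or CRLF layer, so that length() of each record equals the bytes it adds to the file. The helper name write_rotating and the file naming are hypothetical, not something from the posts above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write records to numbered files ($base . 1, $base . 2, ...),
# starting a new file when the next record would push the current one past $limit.
sub write_rotating {
    my ($base, $limit, @records) = @_;
    my $n = 1;
    open(my $fh, '>', "$base$n") or die "Cannot open $base$n: $!";
    for my $rec (@records) {
        # tell() gives the logical position, buffered bytes included.
        my $pos = tell($fh);
        if ($pos > 0 && $pos + length($rec) > $limit) {
            close($fh) or die "Cannot close $base$n: $!";
            $n++;
            open($fh, '>', "$base$n") or die "Cannot open $base$n: $!";
        }
        print {$fh} $rec;
    }
    close($fh) or die "Cannot close $base$n: $!";
    return $n;    # number of files written
}

# Demo: 100 records of 1024 bytes each, 50k limit, written to a temp directory.
my $dir   = tempdir(CLEANUP => 1);
my $count = write_rotating("$dir/out", 50 * 1024, ("x" x 1023 . "\n") x 100);
print "wrote $count files\n";
```

The check happens before each print, so a file never exceeds the limit unless a single record is larger than the limit on its own.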
Re: limit output filesize
by davidrw (Prior) on Jul 08, 2005 at 18:57 UTC
    Can you show how you're writing to the file? You say stat won't work because of buffering... can you just autoflush with $|++ ?

    I'm asking for your code to see whether the print statements could keep a counter of the bytes written out, so that right there you can close the handle and reopen it with a different filename.
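One caveat when counting bytes at the print statements: length() counts characters, which matches the bytes written only for plain single-byte data. The sketch below assumes the output handle uses a UTF-8 encoding layer and measures the encoded length instead. The helper name utf8_byte_length is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: number of bytes $str occupies once encoded as UTF-8.
sub utf8_byte_length {
    my ($str) = @_;
    return length(encode('UTF-8', $str));
}

# Keep a running total of bytes, not characters.
my $bytes_written = 0;
for my $chunk ("plain ascii\n", "caf\x{e9}\n") {
    # The second chunk is 5 characters but 6 bytes in UTF-8.
    $bytes_written += utf8_byte_length($chunk);
}
print "bytes so far: $bytes_written\n";
```

For pure ASCII output with no encoding layer, plain length() is enough and avoids the extra encode call.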
Re: limit output filesize
by QM (Parson) on Jul 08, 2005 at 18:58 UTC
    Something like this?
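        # No underscore in the name: magic increment only works on
        # strings matching /^[a-zA-Z]*[0-9]*$/
        my $output_file = 'output0000';
        my $file_size   = 0;
        open(STDOUT, '>', $output_file)
            or die "Error opening $output_file for writing: $!";

        while (my $line = <>) {
            do_something($line);
            if ($file_size > 50000) {
                close STDOUT or die "Error closing $output_file: $!";
                $output_file++;    # magic increment: output0000 -> output0001
                open(STDOUT, '>', $output_file)
                    or die "Error opening $output_file for writing: $!";
                $file_size = 0;
            }
            print $line;
            $file_size += length $line;    # track what has been printed so far
        }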

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re: limit output filesize
by Forsaken (Friar) on Jul 09, 2005 at 05:37 UTC
    I'm anything but an expert, but I would likely go for a solution where I used a single scalar to keep track of how many bytes were already written, and before each write check whether $bytes_written + length($current_line) > 51200. If that's the case, hang on to $current_line for the moment, close the file I was writing to, nice and tidy, and open the next one. Might be tedious, but it's simple :-) Not sure how efficient it would be though.


    Remember rule one...
Re: limit output filesize
by TedPride (Priest) on Jul 09, 2005 at 07:08 UTC
    I agree with Forsaken. Just do something like the following:
        use strict;
        use warnings;

        my $fname = 'myfile';
        my $limit = 50 * 1024;
        my ($size, $fc, $handle, $printstr) = (0, 1);

        open($handle, ">$fname$fc") or die "Cannot open $fname$fc: $!";
        for (1..100) {
            $printstr = '-' x 1000 . "\n";
            # Start a new file if this string would push us past the limit
            if ($limit < $size + length($printstr)) {
                close($handle);
                $size = 0;
                $fc++;
                open($handle, ">$fname$fc") or die "Cannot open $fname$fc: $!";
            }
            print $handle $printstr;
            $size += length($printstr);
        }
        close($handle);
    If you have control over your prints, you also have the ability to check the lengths of the strings you're printing, which you can then use to keep track of how large the file is.

    Of course, you'll need to do (one) initial stat if you begin by appending to an existing file.
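The appending case could look roughly like this: a single -s when the file is opened seeds the counter, and after that only the lengths of the printed strings are added. This is a sketch under the same assumption as above (length() equals bytes written). The helper name open_for_append and the demo file are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: open $path for appending and return the handle
# together with the file's current size, so the counter starts correctly.
sub open_for_append {
    my ($path) = @_;
    my $size = (-s $path) || 0;    # missing or empty file counts as 0
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    return ($fh, $size);
}

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/log";

# Simulate a 100-byte file left over from an earlier run.
open(my $old, '>', $path) or die "Cannot create $path: $!";
print {$old} "x" x 100;
close($old) or die "Cannot close $path: $!";

# Append one line and keep the counter in step without another stat.
my ($fh, $size) = open_for_append($path);
my $line = "new line\n";
print {$fh} $line;
$size += length($line);
close($fh) or die "Cannot close $path: $!";

print "counter: $size, on disk: ", (-s $path), "\n";
```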