Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I would like to know the best way to append a huge file (terabytes in size) to another file.

Both source and destination could be terabytes in size.

As of now, I am using
use File::Copy;
open ( my $s, "<", "source" ) or die "Could not open source file: $!\n";
open ( my $d , ">>", "dest" ) or die "Could not open destination file: $!\n";
copy($s, $d) or die "Could not copy source to dest : $!\n";
(close $s && close $d ) or die "Could not close file handles : $!\n";

But sometimes the above code reports "Could not close file Input/Output error", and the destination file is corrupted. I've noticed this line in the File::Copy documentation (http://search.cpan.org/~rjbs/perl-5.16.0/lib/File/Copy.pm): "Note that passing in files as handles instead of names may lead to loss of information on some operating systems; it is recommended that you use file names whenever possible."

I am not sure whether passing file handles to copy is causing this issue.

Finally, I would like to know the right and best way to append a file to another file without consuming memory.

Thanks in advance.

Re: Best way to append bigger files
by brx (Pilgrim) on Jul 24, 2012 at 15:21 UTC

    Here is a try without File::Copy.

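    #!perl
    # Use the filesystem's preferred I/O block size, falling back to 4096 bytes
    my $blocksize = (stat '.')[11] || 4096;
    open ( my $s, "<", "source" ) or die "Could not open source file: $!\n";
    open ( my $d , ">>", "dest" ) or die "Could not open destination file: $!\n";
    my $buffer;
    do {
        # A reference to an integer in $/ makes <$s> return fixed-size records
        local $/ = \$blocksize;
        print {$d} $buffer while ( $buffer = <$s> );
    };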
    English is not my mother tongue.
    Les tongues de ma mère sont "made in France".
      Aren't these time consuming ways?
        Aren't these time consuming ways?

        Perhaps; that is something to test. If portability is not a priority, system "cat source >> dest" is probably better on Unix-like systems. The OP wants to avoid memory issues:

        OP: Finally, I would like to know the right and best way to append a file to another file without consuming memory.

        So I presume that reading and writing through a buffer that matches the block size is a good choice, if we can find out the right size. With files of this size (terabytes), there is another question: do I really have to append the files? :)
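
        The buffered read/write idea above could look roughly like the sketch below. It is only an illustration, not code from this thread: the append_file name, its arguments and the 1 MiB default buffer are assumptions. It uses sysread/syswrite and checks every call, including close, so that an I/O error is reported at the call where it happens.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: append $src to $dst in fixed-size chunks.
        # Memory use stays at roughly one buffer, whatever the file sizes are.
        sub append_file {
            my ( $src, $dst, $bufsize ) = @_;
            $bufsize ||= 1024 * 1024;    # assumed default, tune for your disks

            open( my $in,  '<',  $src ) or die "Could not open $src: $!\n";
            open( my $out, '>>', $dst ) or die "Could not open $dst: $!\n";
            binmode $in;
            binmode $out;

            my $total = 0;
            while (1) {
                my $read = sysread( $in, my $buffer, $bufsize );
                die "Read error on $src: $!\n" unless defined $read;
                last if $read == 0;      # end of file

                # syswrite may write fewer bytes than requested, so loop until done
                my $offset = 0;
                while ( $offset < $read ) {
                    my $written = syswrite( $out, $buffer, $read - $offset, $offset );
                    die "Write error on $dst: $!\n" unless defined $written;
                    $offset += $written;
                }
                $total += $read;
            }

            close $in  or die "Could not close $src: $!\n";
            close $out or die "Could not close $dst: $!\n";
            return $total;               # number of bytes appended
        }
        ```

        A call such as append_file( "source", "dest" ) would then append in 1 MiB steps; whether that size is a good one is something to measure on the real system.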

Re: Best way to append bigger files
by 2teez (Vicar) on Jul 24, 2012 at 15:18 UTC

    open ( my $d , ">>", "dest" ) or die "Could not open destination file: $!\n";
    open ( my $s, "<", "source" ) or die "Could not open source file: $!\n";
    # Copy line by line; chomp followed by $/ writes each line back with its newline
    while (<$s>) {
        chomp;
        print {$d} $_, $/;
    }
    close $s or die "can't close file:$!";
    close $d or die "can't close file:$!";

Re: Best way to append bigger files
by locked_user sundialsvc4 (Abbot) on Jul 24, 2012 at 21:01 UTC

    If I had “terabyte-sized” files to deal with, then I would try mightily to avoid copying them at all. It is well worth whatever effort it might take, IMHO, to build a list of filenames, and to arrange for whatever program it may be to switch from one file to the next. I say this, not merely because of the time required to copy the data, but also because of the unwieldy unmanageability of such a large contiguous single file.
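
    One way to picture the "list of filenames" idea is the sketch below. It is an assumption-laden illustration rather than a known implementation: read_files_in_sequence, its callback interface and the 1 MiB chunk size are made up for this example. The consumer sees the files as one continuous stream, and nothing is copied on disk.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: call $callback with each chunk of each file in
    # @$files, in order, as if the files were one concatenated stream.
    sub read_files_in_sequence {
        my ( $files, $callback, $bufsize ) = @_;
        $bufsize ||= 1024 * 1024;    # assumed default chunk size

        for my $file (@$files) {
            open( my $in, '<', $file ) or die "Could not open $file: $!\n";
            binmode $in;
            while (1) {
                my $read = read( $in, my $chunk, $bufsize );
                die "Read error on $file: $!\n" unless defined $read;
                last if $read == 0;  # end of this file, move on to the next
                $callback->( $chunk, $file );
            }
            close $in or die "Could not close $file: $!\n";
        }
        return;
    }
    ```

    For example, read_files_in_sequence( [ "dest", "source" ], sub { my ($chunk) = @_; ... } ) would process dest followed by source without ever joining them.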

      ... the unwieldy unmanageability of such a large contiguous single file.

      Can you elaborate?

Re: Best way to append bigger files
by BrowserUk (Patriarch) on Jul 25, 2012 at 20:17 UTC

    FWIW: On my single-disk, Windows system, this is almost twice as fast as File::Copy.

    Here, using a 10 MB buffer seems to strike the right balance between latency and head shuffling; but if you are on a multi-disk system that balance point may be considerably different.

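    #! perl -slw
    use strict;

    # Buffer size in bytes; with -s it can be overridden on the command line (-BUFSIZE=...)
    our $BUFSIZE //= 10*1024**2;
    my( $source, $target ) = @ARGV;
    open IN, '<', $source or die "$source: $!";
    binmode IN;
    open OUT, '>>', $target or die "$target: $!";
    binmode OUT;
    {
        # Read fixed-size records and clear $\ so print adds nothing to the data
        local( $/, $\ ) = \$BUFSIZE;
        print OUT while <IN>;
    }
    close OUT;
    close IN;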

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?