in reply to Splitting big file or merging specific files?
As a general rule, only deal with the data you need to deal with immediately. In this case, phase one means: open your input file, then, while there is more data, read a couple of lines and write them to the next output file. Phase two means: while there is another file, read it and write its contents to your output.
Note there isn't a "for" there anywhere. It's all "while something". Let's see how that could look:
#!/usr/bin/perl
use strict;
use warnings;

=pod
Use this script as the input file to be cut up. We'll put the generated
files into a throw away directory that is a sub-directory to the directory
we are running the script from.

This script creates the split files and the rejoined text. The rejoined
text doesn't get saved, but is compared to the original script as a check
that everything worked.
=cut

my $subDir = './delme';

# Create a throw away sub-directory for the test. Wrapped in an eval because
# we don't care if it fails (probably because the dir already exists).
eval {mkdir $subDir};

seek DATA, 0, 0; # Set DATA to the start of this file
my $origText = do {local $/; <DATA>}; # Slurp the script text to check against
seek DATA, 0, 0; # Back to the start again

# Create the split files
my $fileNum = 0;

while (!eof DATA) {
    my $fileLines;

    $fileLines .= <DATA> for 1 .. 2;
    last if !defined $fileLines;

    ++$fileNum;
    open my $outFile, '>', "$subDir/outFile$fileNum.txt";
    print $outFile $fileLines;
    close $outFile;
}

# Join the files back up again
my $joinedText;
$fileNum = 1;

while (open my $fileIn, '<', "$subDir/outFile$fileNum.txt") {
    $joinedText .= do {local $/; <$fileIn>}; # Slurp the file
    ++$fileNum;
}

print "Saved and Loaded OK\n" if $joinedText eq $origText;
__DATA__
The "slurp" bits set a Perl special variable to ignore line breaks so we can read an entire file in one hit. On modern systems with plenty of memory that works fine for files of hundreds of megabytes so it sould be fine for our toy example.
The for 1 .. 2 fetches two lines from the input file. If there is an odd number of lines in the input it doesn't matter - we end up concatenating undef to $fileLines, which amounts to a no-op, so no harm done.
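A quick way to convince yourself of the undef no-op (one caveat: under use warnings Perl does report an uninitialized value for the concatenation, so this demo silences that category):

```perl
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'uninitialized'; # the concatenation below would otherwise warn

# Concatenating undef onto a string leaves it unchanged, which is why
# an odd line count is harmless in the split loop above.
my $fileLines = "last line\n";
$fileLines .= undef;         # behaves like appending the empty string

print $fileLines;
```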