As a general rule, only deal with the data you need to deal with immediately. In this case, for phase one that means: open your input file, then, while there is more data, read a couple of lines and write them to the next output file. For phase two it means: while there is another file, read it and write its contents to your output.

Note there isn't a "for" there anywhere. It's all "while something". Let's see how that could look:

#!/usr/bin/perl
use strict;
use warnings;

=pod

Use this script as the input file to be cut up. We'll put the generated files
into a throw away directory that is a sub-directory to the directory we are
running the script from.

This script creates the split files and the rejoined text. The rejoined text
doesn't get saved, but is compared to the original script as a check that
everything worked.

=cut

my $subDir = './delme';

# Create a throw away sub-directory for the test. Wrapped in an eval because
# we don't care if it fails (probably because the dir already exists).
eval {mkdir $subDir};

seek DATA, 0, 0;    # Set DATA to the start of this file
my $origText = do {local $/; <DATA>};    # Slurp the script text to check against
seek DATA, 0, 0;    # Back to the start again

# Create the split files
my $fileNum = 0;

while (!eof DATA) {
    my $fileLines;

    $fileLines .= <DATA> for 1 .. 2;
    last if !defined $fileLines;

    ++$fileNum;
    open my $outFile, '>', "$subDir/outFile$fileNum.txt"
        or die "Can't create outFile$fileNum.txt: $!";
    print $outFile $fileLines;
    close $outFile;
}

# Join the files back up again
my $joinedText;

$fileNum = 1;
while (open my $fileIn, '<', "$subDir/outFile$fileNum.txt") {
    $joinedText .= do {local $/; <$fileIn>};    # Slurp the file
    ++$fileNum;
}

print "Saved and Loaded OK\n" if $joinedText eq $origText;
__DATA__

The "slurp" bits localize a Perl special variable ($/, the input record separator) to undef so that a single read returns the entire file in one hit rather than one line. On modern systems with plenty of memory that works fine for files of hundreds of megabytes, so it should be fine for our toy example.
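As a standalone sketch of the slurp idiom (the file name and contents here are made up for the example, and the script writes its own throw-away input so it runs anywhere):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Create a small throw-away file so the example is self-contained.
my $file = 'slurp_demo.txt';
open my $out, '>', $file or die "Can't create $file: $!";
print $out "line 1\nline 2\n";
close $out;

# The slurp idiom: localizing $/ (the input record separator) to undef
# makes the readline operator return the entire file in one read.
open my $in, '<', $file or die "Can't open $file: $!";
my $contents = do {local $/; <$in>};
close $in;

print length($contents), " characters slurped\n";
unlink $file;    # Clean up the throw-away file
```

Because $/ is only localized inside the do block, line-by-line reading elsewhere in the script is unaffected.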

The for 1 .. 2 fetches two lines from the input file. If there is an odd number of lines in the input it doesn't matter: we end up concatenating undef to $fileLines, which amounts to a no-op (apart from an "uninitialized value" warning under the warnings pragma), so no harm done.
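A quick sketch of why the trailing undef is harmless content-wise; the expected warning is silenced explicitly so the demonstration runs clean:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pretend this is the final chunk of a file with an odd line count.
my $fileLines = "last odd line\n";

{
    # Reading past end of file returns undef; concatenating undef
    # leaves the string unchanged. Silence the expected warning.
    no warnings 'uninitialized';
    $fileLines .= undef;
}

print $fileLines;    # the string is unchanged
```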

Premature optimization is the root of all job security

In reply to Re: Splitting big file or merging specific files? by GrandFather
in thread Splitting big file or merging specific files? by zarath
