Re^2: How to restart a while loop that is iterating through each line of a file?

I have a large file (~3.4 million lines) that I am trying to break into 100,000-line chunks. Every one of the smaller files needs the file header, which is why I wanted the first 16 lines repeated at the top of each of the 34 files.

Thanks for the suggestions, monks! -cookersjs

Replies are listed 'Best First'.
Re^3: How to restart a while loop that is iterating through each line of a file? by Marshall (Canon) on Nov 29, 2016 at 21:59 UTC
OK, I think I see what you are trying to do. The code below is one way of many to do this sort of thing.

PS: The code has no advance knowledge of how many smaller files will be created. Nothing like "34 files" is hard-coded; it creates as many files as are needed. With the sample data there are 3 files: data lines 1, 2, 3, 4 in the first, data lines 5, 6, 7, 8 in the second, and the 9th data line alone in the third.
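As a quick check on the file counts mentioned above: the number of output files is the data-line count divided by the chunk size, rounded up. A minimal sketch follows. `files_needed` is a hypothetical helper name, and the exact line count for the big file is an assumption, since the thread only says "~3.4 million".

```perl
use strict;
use warnings;

# Hypothetical helper: how many chunk files are needed for
# $nDataLines data lines at $nPerFile data lines per file.
sub files_needed
{
   my ($nDataLines, $nPerFile) = @_;
   die "chunk size must be positive\n" unless $nPerFile > 0;
   return int( ($nDataLines + $nPerFile - 1) / $nPerFile );   # ceiling division
}

# The 9 sample data lines at 4 per file:
print files_needed(9, 4), "\n";                      # prints 3

# Assumed: exactly 3,400,000 lines, of which 16 are header:
print files_needed(3_400_000 - 16, 100_000), "\n";   # prints 34
```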


#!/usr/bin/perl
use strict;
use warnings;

my $nHeaderLines = 2;
my $nDataLinesPerFile = 4;

my @header;

# the "Big File" is the DATA segment below,
# maybe millions of lines...

for (1..$nHeaderLines) # read header lines from big file
{
   my $header_line = <DATA>;
   die "Input has fewer than $nHeaderLines header lines\n"
      unless defined $header_line;
   push @header, $header_line;
}

# divide the big file data into smaller files,
# each with the initial header...

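my $nFile = 0;
my $fileNameBase = "SmallerFile";
my $nDataLine = 0;   # data lines written to the current output file
my $out;             # current output handle; undefined until the first data line

while (defined (my $line = <DATA>))
{
   if (!$out or $nDataLine >= $nDataLinesPerFile) # start new file
   {
       if ($out) { close $out or die "Cannot close output file: $!"; }
       $nFile++;
       my $name = "$fileNameBase$nFile.txt";
       open ($out, '>', "./$name") or die "Cannot open $name: $!";
       print $out @header;
       $nDataLine = 0;
   }
   print $out $line;
   $nDataLine++;
}
if ($out) { close $out or die "Cannot close output file: $!"; }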

print "Program Done!\n";

__DATA__
Header 1
Header 2
data 1
data 2
data 3
data 4
data 5
data 6
data 7
data 8
data 9
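For a real file rather than the DATA segment, the same idea can be wrapped in a sub. This is only a sketch under assumptions: the name `split_with_header`, the output naming scheme, and the `BigFile.txt` path in the usage line are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: split $inPath into files named "$outBase<N>.txt",
# each holding the first $nHeaderLines lines of the input followed by
# up to $nPerFile data lines. Returns the number of files written.
sub split_with_header
{
   my ($inPath, $outBase, $nHeaderLines, $nPerFile) = @_;
   die "chunk size must be positive\n" unless $nPerFile > 0;

   open (my $in, '<', $inPath) or die "Cannot open $inPath: $!";

   my @header;
   for (1..$nHeaderLines)
   {
      my $line = <$in>;
      last unless defined $line;   # input is shorter than the header
      push @header, $line;
   }

   my ($nFile, $nInFile, $out) = (0, 0, undef);
   while (defined (my $line = <$in>))
   {
      if (!$out or $nInFile >= $nPerFile)   # start new file
      {
         if ($out) { close $out or die "Cannot close output file: $!"; }
         $nFile++;
         my $name = "$outBase$nFile.txt";
         open ($out, '>', $name) or die "Cannot open $name: $!";
         print $out @header;
         $nInFile = 0;
      }
      print $out $line;
      $nInFile++;
   }
   if ($out) { close $out or die "Cannot close output file: $!"; }
   close $in;
   return $nFile;
}

# Example call (path and sizes are placeholders):
# my $n = split_with_header("BigFile.txt", "SmallerFile", 16, 100_000);
```

Because the input is read one line at a time, memory use stays small no matter how large the file is; only the header lines are held in an array.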