Hi all. Junior Perl coder here. I'm trying to write a script that splits a large file of possibly millions of similar records into multiple smaller files. I have most of it working, but there is one caveat: I cannot split records of the same type across files. For example, if 1000 records of type "A" appear sequentially in the large file and I reach my defined "smaller" file size in the middle of those "A" records, I need to keep adding the "A" records to the current small file until I reach the next record type in the large file, at which point I want to start the next "small" file.

I am having trouble with that part of the script. I cannot figure out how to advance the record pointer to the next record in the "big" file so that I can step through it record by record until I find the next record type, which will be my trigger to start the next "small" file. My code follows. Any help would be greatly appreciated.
    foreach $filename (@prfiles) {
        chomp $filename;
        $total_recs=0;
        $counter_recs=0;
        $previous_rec_ssn=0;
        $file_size_met="N";
        open (INFILE1, '<', "$filename") or print "Cannot open $filename.";
        print "Now processing: $filename\n";
        while (<INFILE1>) {
            if ($file_size_met ne "Y") {
                $file_count=1;
                $mod_filename="$filename\.$file_count" ;
                print "Writing output to: $mod_filename\n\n";
                open (OUTFILE1, ">>" . "$mod_filename") or exit(201);
                while ($counter_recs < 50000) {
                    print OUTFILE1 $_;
                    $total_recs=($total_recs + 1);
                    $counter_recs=($counter_recs + 1);
                }
                print "$total_recs records have been processed\n";
                $counter_recs=0;
            }
            $actual_size = (stat($mod_filename))[7];
            if ($actual_size >= $outsize){$file_size_met="Y"};
            print "Current file size is $actual_size bytes.\n";
            # ***** THIS IS WHERE I AM GOING OFF THE RAILS *****
            if (($actual_size == $outsize) or ($actual_size > $outsize) and ($file_size_met eq "Y")) {
                @line_contents = split (/\|/,$_);
                $record_ssn=($line_contents[6]);
                print "current SSN is $record_ssn.\n";
                if ($previous_rec_ssn == $record_ssn) {
                    print OUTFILE1 $_;
                    $previous_rec_ssn=$record_ssn;
                    print "previous SSN is $previous_rec_ssn.\n";
                    print "current SSN is $record_ssn.\n";
                }
                next;
            }
            next;
        }
        close INFILE1;
        close OUTFILE1;
        print "\n$total_recs records were processed for $filename\n\n";
    }
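For reference, the rollover rule described in the question (keep writing to the current small file until the size limit is met and the record type changes) could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a drop-in fix: the sub name `split_keeping_groups` and its parameters are hypothetical, records are assumed to be pipe-delimited with the grouping key at a given field index (index 6, as in the `split` call in the posted code), and the output size is tracked by counting what has been written instead of calling `stat`, since buffered output may not be on disk yet when `stat` runs.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): copy records from $in_fh
# into numbered files "$base.1", "$base.2", ... A new output file is started
# only when the current one has reached $max_bytes AND the key field differs
# from the previous record's key, so a run of equal keys is never split.
sub split_keeping_groups {
    my ($in_fh, $base, $max_bytes, $key_index) = @_;

    my ($out_fh, $prev_key);
    my $file_count = 0;
    my $bytes      = 0;    # size written to the current output file
    my @written;           # names of the output files created

    # One read per loop iteration: this is what advances the record pointer.
    while (my $line = <$in_fh>) {
        chomp(my $rec = $line);
        my $key = (split /\|/, $rec)[$key_index];
        $key = '' unless defined $key;    # assumption: missing key is empty

        my $key_changed = !defined $prev_key || $key ne $prev_key;

        # Roll over only at a group boundary once the size limit is met.
        if (!$out_fh || ($bytes >= $max_bytes && $key_changed)) {
            if ($out_fh) {
                close $out_fh or die "Cannot close output file: $!";
            }
            $file_count++;
            my $name = "$base.$file_count";
            open $out_fh, '>', $name or die "Cannot open $name: $!";
            push @written, $name;
            $bytes = 0;
        }

        print {$out_fh} $line;
        $bytes += length $line;
        $prev_key = $key;
    }

    if ($out_fh) {
        close $out_fh or die "Cannot close output file: $!";
    }
    return @written;
}
```

With an input handle already opened on the big file, a call might look like `my @files = split_keeping_groups($in_fh, $filename, $outsize, 6);`. The key comparison uses `ne` (string comparison) rather than `==`, on the assumption that the key field may not always be purely numeric.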
In reply to Large FIle Splitter by insta.gator