#!/usr/bin/perl
use constant DAILY_RUN => 1024 * 1024 * 1000;   # ~1 GB per output piece
use strict;
use warnings;

my $file = $ARGV[0] or die "Usage: $0 <input_file>";
open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";

my $cnt = 1;
my $out_file = "$file.$cnt";
# Will clobber an existing file by this name (fix if important)
open(my $out_fh, '>', $out_file) or die "Unable to open '$out_file' for writing: $!";

while (<$fh>) {
    # Once the current piece has grown past ~1 GB, start a new one at the
    # next line that begins a "100" record
    if (-s $out_file > DAILY_RUN && /^100/) {
        ++$cnt;
        $out_file = "$file.$cnt";
        open($out_fh, '>', $out_file) or die "Unable to open '$out_file' for writing: $!";
    }
    print $out_fh $_;
}
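To run it, pass the file to split as the only argument, e.g. perl split_by_100.pl bigfile.dat (the script and file names here are just examples); the pieces are written alongside the input as bigfile.dat.1, bigfile.dat.2, and so on.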
Now, it looks like your lines are fixed length, so one optimization may be to not check the file size after every write, but to wait until you have written enough to be at or near 1 GB and then set a flag so you only start watching for the next "100" record. Also note that this code writes at least 1 GB and then starts a new file as soon as a 100 record is encountered; you may want to keep each piece under 1 GB instead. Again, you are in a better position to address these details than I am.

Finally, it may be possible to process each record set as a whole, rather than a line at a time, by setting $/ = "\n100"; That is an advanced technique you can read about in perlvar. It complicates the code, but it is presumably more efficient (fewer disk reads/writes). A rough sketch of that approach follows.
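Here is a minimal, untested sketch of that idea, assuming (as in your sample) that every record set starts with a line beginning "100" and keeping the same ~1 GB limit as above; the constant and variable names are mine. It also tracks bytes written in a variable instead of stat-ing the file on every record, and it rolls over to a new file before exceeding the limit rather than after. The fiddly part is that chomp eats the "\n100" separator off the end of each chunk, so the leading "100" of the next record set has to be carried forward by hand:

#!/usr/bin/perl
use strict;
use warnings;
use constant MAX_SIZE => 1024 * 1024 * 1000;    # ~1 GB, same limit as above

my $file = $ARGV[0] or die "Usage: $0 <input_file>";
open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";

$/ = "\n100";    # readline now returns whole record sets, not single lines

my $cnt     = 1;
my $written = 0;     # bytes written to the current piece (avoids stat-ing it)
my $carry   = '';    # the "100" chomped off the previous chunk, if any
open(my $out_fh, '>', "$file.$cnt")
    or die "Unable to open '$file.$cnt' for writing: $!";

while (my $chunk = <$fh>) {
    my $removed = chomp $chunk;          # strips a trailing "\n100" if present
    my $record  = $carry . $chunk;
    $record    .= "\n" if $removed;      # restore the newline chomp took
    $carry      = $removed ? '100' : ''; # that "100" starts the NEXT record set

    # open the next piece before this record set would push us past the limit
    if ($written && $written + length($record) > MAX_SIZE) {
        ++$cnt;
        open($out_fh, '>', "$file.$cnt")
            or die "Unable to open '$file.$cnt' for writing: $!";
        $written = 0;
    }
    print $out_fh $record;
    $written += length($record);
}
print $out_fh $carry if length $carry;   # only if the input ends exactly on "\n100"

Whether the fewer, larger reads actually buy you much will depend on your disks and Perl's own buffering, so it is worth benchmarking against the line-at-a-time version before committing to the extra complexity.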
Cheers - L~R
In reply to Re: Can I split a 10GB file into 1 GB sizes using my repeating data pattern by Limbic~Region
in thread Can I split a 10GB file into 1 GB sizes using my repeating data pattern by Anonymous Monk