#!/usr/bin/perl
use constant DAILY_RUN => 1024 * 1024 * 1000;   # ~1 GB per output piece
use strict;
use warnings;

my $file = $ARGV[0] or die "Usage: $0 <input_file>";
open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";

my $cnt = 1;
my $out_file = "$file.$cnt";
# Will clobber an existing file by this name (fix if important)
open(my $out_fh, '>', $out_file) or die "Unable to open '$out_file' for writing: $!";

while (<$fh>) {
    # Once the current piece has grown past ~1 GB, start a new one at the
    # next line that begins a "100" record
    if (-s $out_file > DAILY_RUN && /^100/) {
        ++$cnt;
        $out_file = "$file.$cnt";
        open($out_fh, '>', $out_file) or die "Unable to open '$out_file' for writing: $!";
    }
    print $out_fh $_;
}
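To run it, pass the file to split as the only argument, e.g. perl split_by_100.pl bigfile.dat (the script and file names here are just examples); the pieces are written alongside the input as bigfile.dat.1, bigfile.dat.2, and so on.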
Now, it looks like your lines are fixed length, so one optimization may be to not check the file size after every write, but to wait until you have written enough to be at or near 1 GB and then set a flag so you only start watching for the next "100" record. Also note that this code writes at least 1 GB and then starts a new file as soon as a 100 record is encountered; you may want to keep each piece under 1 GB instead. Again, you are in a better position to address these details than I am.

Finally, it may be possible to process each record set as a whole, rather than a line at a time, by setting $/ = "\n100"; That is an advanced technique you can read about in perlvar. It complicates the code, but it is presumably more efficient (fewer disk reads/writes). A rough sketch of that approach follows.
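Here is a minimal, untested sketch of that idea, assuming (as in your sample) that every record set starts with a line beginning "100" and keeping the same ~1 GB limit as above; the constant and variable names are mine. It also tracks bytes written in a variable instead of stat-ing the file on every record, and it rolls over to a new file before exceeding the limit rather than after. The fiddly part is that chomp eats the "\n100" separator off the end of each chunk, so the leading "100" of the next record set has to be carried forward by hand:

#!/usr/bin/perl
use strict;
use warnings;
use constant MAX_SIZE => 1024 * 1024 * 1000;    # ~1 GB, same limit as above

my $file = $ARGV[0] or die "Usage: $0 <input_file>";
open(my $fh, '<', $file) or die "Unable to open '$file' for reading: $!";

$/ = "\n100";    # readline now returns whole record sets, not single lines

my $cnt     = 1;
my $written = 0;     # bytes written to the current piece (avoids stat-ing it)
my $carry   = '';    # the "100" chomped off the previous chunk, if any
open(my $out_fh, '>', "$file.$cnt")
    or die "Unable to open '$file.$cnt' for writing: $!";

while (my $chunk = <$fh>) {
    my $removed = chomp $chunk;          # strips a trailing "\n100" if present
    my $record  = $carry . $chunk;
    $record    .= "\n" if $removed;      # restore the newline chomp took
    $carry      = $removed ? '100' : ''; # that "100" starts the NEXT record set

    # open the next piece before this record set would push us past the limit
    if ($written && $written + length($record) > MAX_SIZE) {
        ++$cnt;
        open($out_fh, '>', "$file.$cnt")
            or die "Unable to open '$file.$cnt' for writing: $!";
        $written = 0;
    }
    print $out_fh $record;
    $written += length($record);
}
print $out_fh $carry if length $carry;   # only if the input ends exactly on "\n100"

Whether the fewer, larger reads actually buy you much will depend on your disks and Perl's own buffering, so it is worth benchmarking against the line-at-a-time version before committing to the extra complexity.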
Cheers - L~R
In reply to Re: Can I split a 10GB file into 1 GB sizes using my repeating data pattern by Limbic~Region
in thread Can I split a 10GB file into 1 GB sizes using my repeating data pattern by Anonymous Monk