Update: Added use autodie; to the code.
Hi live4tech,
Three points:
- You should Choose a Good, Descriptive Title for your posts.
- It’s not a good idea to try to match on \r\n, as this brings in too many complications (as well as being non-portable). Much better to strip these first, then add them back only when needed (i.e., when printing). See the code below.
- There is an off-by-one error in your logic in the final else clause: $linenum is set to 0, but it should be 1, as a line is immediately written to the new file.
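To illustrate the second point, here is a minimal sketch of stripping line endings first and adding "\n" back only on output. The sample strings and the helper name strip_eol are made up for the example; they are not taken from your data or code.

```perl
use strict;
use warnings;

# Hypothetical sample lines with mixed line endings (not from the real data).
my @raw = ("alpha\r\n", "beta\n", "\r\n", "gamma  \r\n");

# Hypothetical helper: remove all trailing whitespace, including "\r" and "\n".
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\s+\z//;
    return $line;
}

# Strip first, then drop the lines that turn out to be blank.
my @clean = grep { $_ ne '' } map { strip_eol($_) } @raw;

# Add the newline back only when printing.
print "$_\n" for @clean;
```

Once the endings are stripped, the blank-line test is a plain comparison with the empty string, whichever platform the file came from.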
That said, I’m still not clear on how you could be getting files with, e.g., 299,701 rows. The suggestion of Anonymous Monk that it’s because you skip the empty lines doesn’t persuade, as there are (according to your specification) as many blank lines as there are data entry lines; and your logic ignores blank lines anyway.
I offer the following in the hope that it may do what you need:
#!perl
use strict;
use warnings;
use autodie;

my $pre       = $ARGV[0];
my $max_lines = 300_000;
my $linenum   = 0;
my $filenum   = 0;

open my $fileout, '>', $pre . '-' . $filenum;

while (my $line = <>)
{
    $line =~ s/ \s* $ //x;    # remove trailing whitespace (incl. "\r\n")

    if ($line ne '')          # ignore blank lines
    {
        if ($linenum++ < $max_lines)
        {
            print $fileout $line, "\n";
        }
        else
        {
            close $fileout;
            open $fileout, '>', $pre . '-' . ++$filenum;
            print $fileout $line, "\n";
            $linenum = 1;
        }
    }
}

close $fileout;
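To check the counter logic without creating any files, the same if/else can be run on a handful of rows in memory. This is only a sketch: the limit of 3, the row names, and the @chunks array (each array reference stands in for one output file) are assumptions made for the illustration.

```perl
use strict;
use warnings;

my $max_lines = 3;       # small limit for illustration (the script uses 300_000)
my $linenum   = 0;
my @chunks    = ([]);    # each array ref stands in for one output file

for my $line (map { "row$_" } 1 .. 7)
{
    if ($linenum++ < $max_lines)
    {
        push @{ $chunks[-1] }, $line;
    }
    else
    {
        push @chunks, [$line];    # start the next "file" with this line
        $linenum = 1;             # one line already written, so 1, not 0
    }
}

print scalar(@$_), "\n" for @chunks;    # prints 3, 3, 1
```

Resetting $linenum to 0 in the else branch instead would let every chunk after the first hold $max_lines + 1 rows, which is the off-by-one error described above.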
HTH,
Athanasius <°(((>< contra mundum