in reply to splitting a large text file and output
"I've tried reading it in and setting up for loops and such, but I cannot get anything that works."
What have you tried? What didn't work? What errors did you get? How do you know it didn't work? Post some code (wrapped in <code> tags), so we can help guide you to a working tool. This is not a code writing service.
Update: Now that you have updated your post with code, I can comment. First, the posted code with the posted input file yields the warnings:
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "1 1 1" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "2 2 2" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "3 3 3" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "4 4 4" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "5 5 5" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "6 6 6" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
This is because you've used the numeric inequality operator (!=, Equality Operators) in place of the negative binding operator (!~, Binding Operators). This is problematic because without binding, the regular expression is tested against your uninitialized magic variable $_, and the result of that match is then compared numerically with your line, which is why each line produces two warnings. Beyond that, what you actually meant is not that the line doesn't contain any whitespace, but rather that the line contains a character that is not whitespace. You can achieve this using the \S character class, so the block becomes:
elsif ($line =~ /\S/) { push (@arr, "$line\n"); next; }
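To see why the two tests differ, here is a small sketch you can run on its own. The helper names has_content and no_whitespace are made up for this illustration and are not part of your script. A data line such as '1 1 1' contains spaces, so a !~ /\s/ test is false for it, while =~ /\S/ is true; the empty separator line behaves the opposite way.

```perl
use strict;
use warnings;

# Hypothetical helper: the test used in the corrected elsif.
# True when the line holds at least one non-whitespace character.
sub has_content {
    my ($line) = @_;
    return $line =~ /\S/ ? 1 : 0;
}

# Hypothetical helper: what a negated match against \s would ask instead.
# True only when the line holds no whitespace character at all.
sub no_whitespace {
    my ($line) = @_;
    return $line !~ /\s/ ? 1 : 0;
}

# Sample lines shaped like the posted input (already chomped).
for my $line ('VECT', '1 1 1', '', '   ') {
    printf "%-8s has_content=%d no_whitespace=%d\n",
        "'$line'", has_content($line), no_whitespace($line);
}
```

Both helpers bind the pattern to $line explicitly, so neither touches $_ and neither triggers the warnings above.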
If we run this, we get your intended output, but as you say, are missing one output file. This can be resolved by adding a final call to your create_file sub, so the final, functional version would be:
#!/usr/bin/perl
use strict;
use warnings;

my $infile  = 'roegen6.vect';
my $count   = 1;
my $outfile = "$infile-section_$count.vect";
my @arr;

sub create_file {
    open(OUT,">$outfile") or die "Error with outfile: $!\n";
    print OUT @arr;
    close(OUT);
    @arr=();
    $count++;
    $outfile="$infile-section_$count.vect";
}

open(IN,$infile) or die "Error with infile $infile: $!\n";
my @data=<IN>;
close(IN);

foreach my $line (@data) {
    chomp($line);
    if ($line =~ /VECT/) {
        push (@arr, "$line\n");
        next;
    }
    elsif ($line =~ /\S/) {
        push (@arr, "$line\n");
        next;
    }
    else {
        push (@arr, "$line\n");
        create_file();
    }
}
create_file if @arr;
Note I've put a conditional on the final output, so it will only write if your buffer has content. Not quite how I would have written it from scratch, but it works.
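The buffer-and-flush logic can also be looked at in isolation, without any file I/O. The following is only a sketch under my own assumptions: split_sections is a hypothetical helper, and the sample lines are invented to resemble your input. It groups lines into sections, closes a section at each blank line, and does the conditional final flush for a last section that has no trailing blank line.

```perl
use strict;
use warnings;

# Hypothetical helper: group lines into sections, closing a section
# whenever a line with no non-whitespace character is seen.
sub split_sections {
    my @lines = @_;
    my (@sections, @buf);
    for my $line (@lines) {
        push @buf, $line;
        if ($line !~ /\S/) {            # blank line ends the current section
            push @sections, [@buf];
            @buf = ();
        }
    }
    push @sections, [@buf] if @buf;     # final flush, only if the buffer has content
    return @sections;
}

# Invented sample data: two sections, the second without a trailing blank line.
my @sections = split_sections("VECT\n", "1 1 1\n", "\n", "VECT\n", "4 4 4\n");
printf "%d sections\n", scalar @sections;
```

Without the final conditional push, the second section would be silently dropped, which is the same symptom as the missing output file.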
Re^2: splitting a large text file and output
by Gulliver (Monk) on Jun 10, 2011 at 16:35 UTC