Re: splitting a large text file and output

You say

I've tried reading it in and setting up for loops and such, but I cannot get anything that works.

What have you tried? What didn't work? What errors did you get? How do you know it didn't work? Post some code (wrapped in <code> tags), so we can help guide you to a working tool. This is not a code writing service.

Update: Now that you have updated your post with code, I can comment. First, the posted code with the posted input file yields the warnings:

Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "1 1 1" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "2 2 2" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "3 3 3" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "4 4 4" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "5 5 5" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.
Use of uninitialized value $_ in pattern match (m//) at fluff.pl line 26, <IN> line 9.
Argument "6 6 6" isn't numeric in numeric ne (!=) at fluff.pl line 26, <IN> line 9.

This is because you've used the numeric inequality operator (!=, see Equality Operators) in place of the negated binding operator (!~, see Binding Operators). Without a binding operator, the regular expression is matched against the magic variable $_, which is uninitialized here (the first warning), and the result of that match is then compared numerically with your line (the second warning). Beyond that, what you actually meant is not that the line doesn't contain any whitespace, but rather that the line contains a character that is not whitespace. You can test for that with the \S character class, so the block becomes:

  elsif ($line =~ /\S/) {
      push (@arr, "$line\n");
      next;
  }
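To see the difference between the two tests in isolation, here is a minimal sketch. The sub names has_content and no_whitespace and the sample strings are mine, purely for illustration; they are not from your script or your input file.

```perl
use strict;
use warnings;

# True when the line holds at least one non-whitespace character.
# =~ binds the regex to the argument instead of the default $_.
sub has_content {
    my ($line) = @_;
    return $line =~ /\S/ ? 1 : 0;
}

# True when the line holds no whitespace character at all.
# Note that an empty line passes this test too.
sub no_whitespace {
    my ($line) = @_;
    return $line !~ /\s/ ? 1 : 0;
}

# Hypothetical sample lines: data, empty, spaces only, header.
for my $line ('1 1 1', '', '   ', 'VECT') {
    printf "%-7s has_content=%d no_whitespace=%d\n",
        "'$line'", has_content($line), no_whitespace($line);
}
```

A data line such as '1 1 1' has content but fails the no-whitespace test, while an empty line does the opposite, which is why only the \S form separates data lines from blank ones. Back in your script, the corrected elsif block above is the only change needed for this step.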

If we run this, we get your intended output but, as you say, one output file is missing. This can be resolved by adding a final call to your create_file sub after the loop, so the final, functional version would be:

#!/usr/bin/perl
use strict;
use warnings;

my $infile = 'roegen6.vect';
my $count = 1;
my $outfile = "$infile-section_$count.vect";
my @arr;

sub create_file {
    # Three-argument open with a lexical filehandle
    open(my $out, '>', $outfile) or die "Error with outfile $outfile: $!\n";
    print $out @arr;
    close($out);
    @arr = ();
    $count++;
    $outfile = "$infile-section_$count.vect";
}

open(my $in, '<', $infile) or die "Error with infile $infile: $!\n";
my @data = <$in>;
close($in);

foreach my $line (@data) {
    chomp($line);

    if ($line =~ /VECT/) {
        push (@arr, "$line\n");
        next;
    }
    elsif ($line =~ /\S/) {
        push (@arr, "$line\n");
        next;
    }
    else {
        push (@arr, "$line\n");
        create_file();
    }
}
create_file() if @arr;

Note I've put a conditional on the final output, so it will only write if your buffer has content. Not quite how I would have written it from scratch, but it works.
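For comparison, one possible from-scratch approach is Perl's paragraph mode, where setting $/ to the empty string makes each read return one blank-line-separated record. The following is only a sketch under assumptions: the sub name split_sections and the sample text are hypothetical, it reads from an in-memory string rather than your file, and it treats only truly empty lines as separators (a line containing just spaces would not split a section, unlike the \S test above).

```perl
use strict;
use warnings;

# Split a string into sections separated by one or more empty lines.
sub split_sections {
    my ($text) = @_;

    # Open the string as an in-memory file so the sketch needs no input file.
    open(my $in, '<', \$text) or die "Cannot open in-memory file: $!\n";

    local $/ = '';    # paragraph mode: empty lines end a record

    my @sections;
    while (my $section = <$in>) {
        $section =~ s/\n+\z/\n/;    # keep exactly one trailing newline
        push @sections, $section;
    }
    close($in);
    return @sections;
}

# Hypothetical sample data; the last section has no trailing blank line.
my @sections = split_sections("VECT\n1 1 1\n2 2 2\n\nVECT\n4 4 4\n");

my $count = 1;
for my $section (@sections) {
    print "--- section $count ---\n", $section;
    $count++;
}
```

Because the last record is returned at end of file whether or not a blank line follows it, no extra final call is needed. In a real script each section would be written to its own "$infile-section_$count.vect" file instead of printed, and the separating blank line is not included in the section here.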


Replies are listed 'Best First'.

Re^2: splitting a large text file and output
by Gulliver (Monk) on Jun 10, 2011 at 16:35 UTC

The original post has been updated with some code but not enough of the input file was provided and kennethk's other questions weren't answered.

The description is confusing because you don't tell us things like what defines a header, what kind of data comes after the header, etc. Are the headers always all text? Or are there numbers or punctuation? Is the data always just numbers? If you can explain the problem in English then you are halfway there for writing the code.

My guess as to why it won't output the last section is that your existing code only writes to an output file when it sees a blank line. If the input file ends without a blank line, the last section is never written.
