0p3nfac3 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am having a major problem thinking my way out of this one. I think the solution may be a simple one but my inexperience with Perl is really holding me back. I hope you folks can help me...

Here is the problem. I have very large files (reports) that are continually being generated; the files average 400 to 800 MB in size. Every file contains the character string "DETS01". Essentially I want to create 2 new files: one with the info preceding DETS01, and one with the info following it (essentially from DETS01 to EOF).

Getting the first part done is a breeze, using something on the order of:

#!/usr/bin/perl -w
open (INPUT, "target.txt") || die "can't read from target:$!\n";
open (OUTPUT1, ">file1.txt") || die "can't write to file1.txt:$!\n";
#
#
while (<INPUT>) {
    last if $_ =~ /^DETS01/;
    # $_ still ends with its newline, so print it as-is
    print OUTPUT1 $_;
}
close (INPUT);
close (OUTPUT1);
exit;

I just can't seem to come to terms with scanning down to the DETS01 and then printing everything that comes after it to the EOF. I've tried using unless and until statements but I've got this freakin' mental block that just won't let me move forward with this. It seems so freakin' simple!!! I just can't get it!!!!

If someone can point me in the right direction I would be so appreciative....

Thank You!!!!

Replies are listed 'Best First'.
Re: How to scan a file, find a character string and print from that string to EOF
by tadman (Prior) on Aug 30, 2002 at 18:11 UTC
    Variation on myocom's technique. Could be a little more elegant with IO::File:
    use strict;
    use warnings;

    my ($input_file) = @ARGV;

    # Auto-generate output names based on input name
    my @output_file = map { "$input_file.$_" } 1..2;

    # Open up all the required filehandles
    # ($input_fh needs "my" to compile under "use strict")
    open(my $input_fh, "<", $input_file) || die "Could not open $input_file\n";
    my @out_fh;
    open ($out_fh[0], ">", $output_file[0]) || die "Could not write to $output_file[0]\n";
    open ($out_fh[1], ">", $output_file[1]) || die "Could not write to $output_file[1]\n";

    # Start by writing to the first filehandle
    my $fh = $out_fh[0];
    while (<$input_fh>) {
        # Switch to the second filehandle at the DETS01 line
        $fh = $out_fh[1] if (/^DETS01/);
        # The braces mark $fh as the filehandle; a bare "print $fh;"
        # would print the handle itself to STDOUT instead of $_
        print {$fh} $_;
    }

    # Now close out everything
    foreach my $out_fh (@out_fh) {
        close($out_fh);
    }
    close($input_fh);
    This switches filehandles when a line starting with "DETS01" is reached. Everything before that line goes in the first file, and everything from that line on goes in the second. So the file "input.txt" makes "input.txt.1" and "input.txt.2".
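For what it's worth, here is a rough sketch of what the IO::File variation mentioned above might look like. It is not tadman's code: the helper name split_report, the sample data, and the temporary directory are all made up for illustration, and it assumes the same ".1" / ".2" naming scheme.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempdir);

# Hypothetical helper (not from the thread): split $input_file into
# "$input_file.1" and "$input_file.2" at the first line starting with DETS01.
sub split_report {
    my ($input_file) = @_;

    my $in = IO::File->new($input_file, 'r')
        or die "Could not open $input_file: $!\n";

    my (@paths, @out);
    for my $n (1, 2) {
        my $path = "$input_file.$n";
        my $fh = IO::File->new($path, 'w')
            or die "Could not write to $path: $!\n";
        push @paths, $path;
        push @out,   $fh;
    }

    # Start with the first handle and switch at the marker line.
    my $fh = $out[0];
    while (defined(my $line = $in->getline)) {
        $fh = $out[1] if $line =~ /^DETS01/;
        $fh->print($line);
    }

    $_->close for $in, @out;
    return @paths;
}

# Demonstration on a small throwaway file (sample data is invented).
my $dir   = tempdir(CLEANUP => 1);
my $input = "$dir/input.txt";
my $seed  = IO::File->new($input, 'w')
    or die "Could not create $input: $!\n";
$seed->print("header\nrow 1\nDETS01 details\nrow 2\n");
$seed->close;

my @parts = split_report($input);
for my $part (@parts) {
    my $fh = IO::File->new($part, 'r')
        or die "Could not open $part: $!\n";
    print "$part:\n", $fh->getlines;
    $fh->close;
}
```

The method-call style ($fh->print) sidesteps the question of how print parses a filehandle held in a scalar, which is probably the main gain here. The file is still read one line at a time, so memory use should stay small even for very large reports.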
Re: How to scan a file, find a character string and print from that string to EOF
by myocom (Deacon) on Aug 30, 2002 at 16:59 UTC

    Without changing your code too much, you might do something like this. The basic idea is that you have a flag telling you whether you've hit DETS01 yet or not, then flip the flag when you hit it.

    #!/usr/bin/perl -w
    open (INPUT, "target.txt") or die "can't read from target:$!\n";
    open (OUTPUT1, ">file1.txt") or die "can't write to file1.txt:$!\n";
    open (OUTPUT2, ">file2.txt") or die "can't write to file2.txt:$!\n";
    #
    #
    my $firsthalf = 1;
    while (<INPUT>) {
        if ($_ =~ /^DETS01/) {
            $firsthalf = 0;
        }
        if ($firsthalf) {
            print OUTPUT1 $_;
        }
        else {
            print OUTPUT2 $_;
        }
    }
    close (INPUT);
    close (OUTPUT1);
    close (OUTPUT2);

    Edit: Removed commas after filehandles in prints. Thanks, tadman!

    "One word of warning: if you meet a bunch of Perl programmers on the bus or something, don't look them in the eye. They've been known to try to convert the young into Perl monks." - Frank Willison
Re: How to scan a file, find a character string and print from that string to EOF
by the pusher robot (Monk) on Aug 30, 2002 at 18:07 UTC
    How about something resembling:

    #!/usr/bin/perl -w
    # "or" rather than "||": without parentheses around open's arguments,
    # "||" binds to the filename string and the die can never run
    open INPUT, "target.txt" or die "can't read from target:$!\n";
    open OUTPUT1, ">file1.txt" or die "can't write to file1.txt:$!\n";
    open OUTPUT2, ">file2.txt" or die "can't write to file2.txt:$!\n";
    #
    #
    while (<INPUT>) {
        last if $_ =~ /^DETS01/;
        # $_ already ends with a newline, so don't add another
        print OUTPUT1 $_;
    }
    while (<INPUT>) {
        print OUTPUT2 $_;
    }
    close INPUT;
    close OUTPUT1;
    close OUTPUT2;
    This doesn't put the line containing DETS01 in either file, so change it if it should.
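If the DETS01 line should land in the second file, one possible change to the two-loop approach is to print it just before leaving the first loop. This is only a sketch: the helper name split_keeping_marker and the sample data are invented, and it uses in-memory filehandles so it can be run without touching any real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copy lines from $in to $before
# until a line starting with DETS01 appears, then send that line and
# everything after it to $after.
sub split_keeping_marker {
    my ($in, $before, $after) = @_;

    while (my $line = <$in>) {
        if ($line =~ /^DETS01/) {
            print {$after} $line;    # keep the marker line in the second file
            last;
        }
        print {$before} $line;
    }

    # The second loop resumes where the first one stopped reading.
    while (my $line = <$in>) {
        print {$after} $line;
    }
    return;
}

# Demonstration with in-memory filehandles instead of real files.
my $report = "header\nrow 1\nDETS01 details\nrow 2\n";
my ($first, $second) = ('', '');
open my $in,   '<', \$report or die "can't read report: $!\n";
open my $out1, '>', \$first  or die "can't write first part: $!\n";
open my $out2, '>', \$second or die "can't write second part: $!\n";

split_keeping_marker($in, $out1, $out2);
close $_ for $in, $out1, $out2;

print "first:\n$first", "second:\n$second";
```

With real files you would pass handles opened on target.txt, file1.txt and file2.txt instead of the in-memory ones.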
Re: How to scan a file, find a character string and print from that string to EOF
by Anonymous Monk on Aug 30, 2002 at 18:14 UTC

    Here's one way

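    #!/usr/bin/perl -w
    use strict;
    open(OUT1, ">file1.txt") || die $!;
    open(OUT2, ">file2.txt") || die $!;
    while(<>){
        # (1 .. /^DETS01/) numbers the lines from line 1 on; the last one
        # gets "E0" appended, so the digits-only test keeps the DETS01
        # line out of file1.txt
        print OUT1 if (1 .. /^DETS01/) =~ /^\d+$/;
        print OUT2 if /^DETS01/ .. eof;
    }
    __END__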
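In case the range operator trick above looks like magic, here is a small sketch that just records what the scalar ".." (the flip-flop) returns for each line. The sample data is invented, and an in-memory filehandle stands in for a real report.

```perl
use strict;
use warnings;

# Sample data invented for illustration; not taken from a real report.
my $report = "header\nrow 1\nDETS01 details\nrow 2\n";
open my $in, '<', \$report or die "can't read sample data: $!\n";

my @trace;    # what the flip-flop returned for each line
while (<$in>) {
    # Scalar context makes ".." a flip-flop; the constant 1 is compared
    # against the current line number in $.
    my $state = (1 .. /^DETS01/);
    push @trace, $state;

    chomp(my $text = $_);
    my $verdict = $state =~ /^\d+$/ ? 'file1' : 'not file1';
    printf "%-16s state=%-6s %s\n", $text, "'$state'", $verdict;
}
close $in;
```

The flip-flop returns a sequence number while it is "on" (1, 2, ...), tacks "E0" onto the number for the line that turns it off (the DETS01 line), and returns the empty string afterwards. That is why matching against /^\d+$/ selects only the lines strictly before DETS01.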