austinby has asked for the wisdom of the Perl Monks concerning the following question:

Dear Perl users, I have a file from which I would like to create n sub-files and then extract data from those sub-files. I am able to generate the n sub-files, but I can't extract data from one of them (the last one). However, when I put the section of the code that extracts data from the sub-files into a separate script, I get the data I am looking for. Can someone please let me know why this is happening? Here's part of the script:
my @file = <files*.txt>;
my $file;
my ($sum, $cti, $avg);
my $tmp = "tmp.txt";
foreach $file (@file) {
    $sum = 0;
    $cti = 0;
    $avg = 0;
    open(FIL,"$file");
    open(TMP,">$tmp");
    while(<FIL>) {
        chomp;
        my ($hd, $md, $tl) = split(/\|/,$_);
        $sum += $tl;
        $cti++
    }
    $avg = $sum / $cti;
    open(FIL,"$file");
    while(<FIL>) {
        chomp;
        my ($hd, $md, $tl) = split(/\|/,$_);
        $tl = $tl / $avg;
        my $tlf = sprintf("%.4f",$tl);
        print TMP "$hd\|$md\|$tlf\n";
    }
    rename("$tmp","$file");
}

Replies are listed 'Best First'.
Re: Batch processing of files
by ikegami (Patriarch) on Feb 11, 2010 at 17:38 UTC

    open(TMP,">$tmp"); clobbers the previous contents. Move it out of the loop if your goal is to accumulate the results of each file.

      Or you can leave it where it is now, and open the file for appending instead of clobbering the file:

      open(TMP,">>$tmp") or die "TMP open failed: $!";

      See perlopentut for more information.

      Update: Further reflection on what is posted would seem to imply this thread is not immediately pertinent to the OP.
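For readers comparing the two open modes discussed above, here is a minimal sketch of the difference between '>' (truncate) and '>>' (append). The scratch directory and the helper names write_line and count_lines are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical scratch file, for illustration only.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/tmp.txt";

# Write one line to $path using the given open mode ('>' or '>>').
sub write_line {
    my ($mode, $line) = @_;
    open(my $fh, $mode, $path) or die "Can't open \"$path\": $!\n";
    print $fh "$line\n";
    close($fh) or die "Can't close \"$path\": $!\n";
}

# Count the lines currently in $path.
sub count_lines {
    open(my $fh, '<', $path) or die "Can't read \"$path\": $!\n";
    my $n = () = <$fh>;
    close($fh);
    return $n;
}

write_line('>', 'first');
write_line('>', 'second');    # truncates: only "second" remains
my $after_clobber = count_lines();

write_line('>>', 'third');    # appends: "second" and "third"
my $after_append = count_lines();

print "lines after two '>' opens: $after_clobber\n";
print "lines after one '>>' open: $after_append\n";
```

Whether truncating is actually a problem depends on what happens to the file between opens; in a loop that renames the temporary file after each pass, truncating on every iteration is the intended behaviour.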

        If you wanted to append to an existing file, I'd still move the open outside of the loop. Why would you want to open the file repeatedly here?

        Anyway, both of our solutions are wrong. The open shouldn't be moved outside of the loop. I missed the rename originally.

Re: Batch processing of files
by toolic (Bishop) on Feb 11, 2010 at 17:39 UTC
    I cannot see any obvious problem in your code, but you should check the status of your open and rename calls, just in case they are failing:
    open(TMP,">$tmp") or die "error opening file $tmp: $!";
    See also Basic debugging checklist
      Cleaned up code with error checking:
      use strict;
      use warnings;
      use Fcntl qw( SEEK_SET );

      my @files = <files*.txt>;

      # Should really use File::Temp
      my $fn_out = "tmp.txt";

      my $error = 0;
      for my $fn_in (@files) {
          open(my $fh_out, '>', $fn_out)
              or die("Can't create output file \"$fn_out\": $!\n");
          open(my $fh_in, '<', $fn_in)
              or do {
                  warn("Can't open input file \"$fn_in\": $!\n");
                  $error = 1;
                  next;
              };

          # First pass: sum the third field and count the lines.
          my $sum = 0;
          my $cnt = 0;
          while (<$fh_in>) {
              chomp;
              $sum += ( split /\|/ )[2];
              ++$cnt;
          }

          my $avg = $sum / $cnt;

          # Rewind for the second pass instead of reopening the file.
          seek($fh_in, 0, SEEK_SET)
              or do {
                  warn("Can't seek in input file \"$fn_in\": $!\n");
                  $error = 1;
                  next;
              };

          # Second pass: write each line with the third field divided by the average.
          while (<$fh_in>) {
              chomp;
              my ($hd, $md, $tl) = split(/\|/, $_);
              my $tlf = sprintf("%.4f", $tl/$avg);
              print $fh_out "$hd|$md|$tlf\n";
          }

          close($fh_in);
          close($fh_out)
              or do {
                  warn("Can't save to output file \"$fn_out\": $!\n");
                  $error = 1;
                  next;
              };

          rename($fn_out, $fn_in)
              or do {
                  warn("Can't rename \"$fn_out\" to \"$fn_in\": $!\n");
                  $error = 1;
                  next;
              };
      }

      exit($error);
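The "Should really use File::Temp" comment could be followed up roughly like this. This is only a sketch, not the poster's code: normalize_file is a hypothetical helper name, it reads the whole input into memory rather than making two passes, and it assumes every line has three |-separated fields with a numeric third field. Note that tempfile creates the file with restrictive permissions, so the renamed file may end up with different permissions than the original.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );
use File::Basename qw( dirname );

# normalize_file() is a hypothetical helper, not from the thread.
# It divides the third field of every line by the file's average and
# replaces the file via a uniquely named temporary file.
# Returns 1 if the file was rewritten, 0 if it was left alone.
sub normalize_file {
    my ($fn_in) = @_;

    open(my $fh_in, '<', $fn_in) or die "Can't open \"$fn_in\": $!\n";
    my @rows;
    while (my $line = <$fh_in>) {
        chomp $line;
        push @rows, [ split /\|/, $line ];
    }
    close($fh_in);
    return 0 unless @rows;    # empty file: nothing to average

    my $sum = 0;
    $sum += $_->[2] for @rows;
    my $avg = $sum / @rows;
    return 0 if $avg == 0;    # avoid dividing by zero

    # Create the temp file next to the input so rename() stays on one filesystem.
    my ($fh_out, $fn_out) = tempfile( DIR => dirname($fn_in), UNLINK => 0 );
    for my $row (@rows) {
        printf $fh_out "%s|%s|%.4f\n", $row->[0], $row->[1], $row->[2] / $avg;
    }
    close($fh_out) or die "Can't save \"$fn_out\": $!\n";
    rename($fn_out, $fn_in) or die "Can't rename \"$fn_out\" to \"$fn_in\": $!\n";
    return 1;
}

normalize_file($_) for <files*.txt>;
```

Because each call gets its own temporary file name, two copies of the script running in the same directory would not overwrite each other's tmp.txt.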

        Thanks everyone.

        I think I found out what the problem was. It looks like I had to close all of the sub-files before populating the @files array.

        close(filehandle);    # close newly generated files before array call.
        my @files = <substituent_R*.txt>;
        Austin-
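A likely explanation for the fix above is output buffering: data printed to a file handle usually sits in Perl's buffer until the handle is flushed or closed, so a file that is still open for writing can look empty or incomplete to code that reads it. A minimal sketch, using a made-up scratch file rather than the real sub-files:

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Hypothetical scratch directory and file name, for illustration only.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/sub_file_1.txt";

open(my $fh, '>', $path) or die "Can't create \"$path\": $!\n";
print $fh "a|b|1\n";

# The line is normally still in the output buffer here, so the file
# on disk has not received it yet.
my $size_before = (-s $path) || 0;

close($fh) or die "Can't close \"$path\": $!\n";

# close() flushes the buffer, so the data is now in the file.
my $size_after = (-s $path) || 0;

print "size before close: $size_before\n";
print "size after close:  $size_after\n";
```

The same reasoning would explain why only the last sub-file was affected if the earlier handles had already been closed or reused by the time the files were read.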