eide has asked for the wisdom of the Perl Monks concerning the following question:

I am looping through files in a directory and printing to csv files with:
    opendir D, $opts{d} or die ...;
    ...
    foreach my $file (readdir D) {
        my ($filebase, $dirname, $ext) = fileparse($file, '\..*');
        my $csvFile = "$opts{d}\\$filebase.csv";
        open FILE_OUT, ">$csvFile" or die ...;
        ...
        foreach my $parent (@parents) {
            print FILE_OUT $parents{$parent} . "\n";
        }
        close FILE_OUT;
    }
The problem is that it only writes data for the first file it comes to. I originally had the program writing to the csv files line by line and everything worked fine. Then I made some necessary code changes to incorporate the %parents hash, and now it still creates the other csv files, but they are all 0 bytes. No matter what combination of files is in the directory, only the first one gets data. Any ideas? Thanks in advance.

Re: readdir and print file issue
by ikegami (Patriarch) on Sep 28, 2007 at 18:59 UTC

    The only explanation I can see is that the omitted code changes @parents and/or %parents.
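
    For example, if the omitted code emptied @parents (or %parents) after the first pass through the loop, you would see exactly this symptom: every later csv file is created and then closed with nothing written to it. A minimal sketch of that failure mode, where the reset is purely hypothetical:

    foreach my $file (readdir D) {
        ...
        foreach my $parent (@parents) {
            print FILE_OUT $parents{$parent} . "\n";  # prints nothing once @parents is empty
        }
        close FILE_OUT;   # leaves a 0-byte csv for every file after the first
        @parents = ();    # hypothetical reset buried in the omitted code
    }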

    By the way, are you sure @parents contains a subset of the keys of %parents? If you didn't care about order, you could do

    foreach my $parent (keys %parents) {
        print FILE_OUT $parents{$parent} . "\n";
    }
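    And if the order does matter, a sort makes the pass deterministic: for my $parent (sort keys %parents) { ... }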
Re: readdir and print file issue
by graff (Chancellor) on Sep 29, 2007 at 15:28 UTC
    The problem is here, I think:
    foreach my $file (readdir D) {
        my ($filebase, $dirname, $ext) = fileparse($file, '\..*');
        my $csvFile = "$opts{d}\\$filebase.csv";
        open FILE_OUT, ">$csvFile" or die ...;
    You need to filter the set of file names coming back from readdir, and not process every one of them. You are adding files to the directory on every iteration of the loop, and I believe those new files will be returned by readdir on subsequent iterations.

    Since you are not paying attention to what $ext is, you will probably get a $file with the value "filename.csv" (which had just been written on some previous iteration) and reopen it for output (truncating it to 0 bytes).

    Presumably, the logic to read the "input" file follows the open FILE_OUT, ">$csvFile", and so at the point when you try to read data from that "input" file, there isn't any. (Update: and even if there were still data in the file, it would not be in the format that you are expecting as input. What would the rest of your code do in that case?)
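
    To make the truncation point concrete: opening a file with ">" empties it the moment the open succeeds, before anything is read or written. A minimal illustration (the file name is hypothetical):

    open FILE_OUT, ">data.csv" or die "Can't open data.csv: $!";
    # data.csv is now 0 bytes, even if nothing is ever printed to it
    close FILE_OUT;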

    Let's suppose that there is a consistent file extension that identifies all the proper input files (e.g. maybe it's something like ".txt"); then your loop should start like this:

    for my $file ( grep /\.txt$/, readdir D )
    or like this:
    for my $file (<$opts{d}/*.txt>) # use a file glob
    so that you never end up reading the "csv" files that you are creating as output. Or you could make sure to write your output files to a different directory (that's an approach I would prefer, personally).
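
    Putting that together, the whole loop with the filter in place might look like the sketch below. The ".txt" extension and the error messages are assumptions filled in for illustration, and lexical handles with the three-argument open replace the bareword FILE_OUT:

    use File::Basename qw(fileparse);

    opendir my $dh, $opts{d} or die "Can't read $opts{d}: $!";
    foreach my $file (grep /\.txt$/, readdir $dh) {
        my ($filebase, $dirname, $ext) = fileparse($file, '\..*');
        my $csvFile = "$opts{d}\\$filebase.csv";   # keeps the original Windows-style path
        open my $out, '>', $csvFile or die "Can't write $csvFile: $!";
        foreach my $parent (@parents) {
            print $out $parents{$parent} . "\n";
        }
        close $out;
    }
    closedir $dh;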