kevyt has asked for the wisdom of the Perl Monks concerning the following question:

I have always had problems reading files from a directory and I hope someone can help. I think the problem is in the "/" in the open command.
my $dir = 'input_files';
opendir(DIR, $dir) or die "can't opendir $!";
while (defined(my $file = readdir(DIR))) {
    # do something with "$dirname/$file"
    print "The directory and file are $dir/$file\n";
    open (IN, "< $dir/$file" ) or die ("Can't open input file $!");
    while ( < IN > ){
        print ;
    }
}
closedir(DIR);
File names
allergies immunizations
Test file content
# Input file for allergies
# Date, Diagnosed By, Type, allergy, reaction,specifics
2009-05-16,Children's Hospital Boston,drugs,penicillin,Blue rash,This only happens on weekends
2009-05-17,Boston Medical Group,drugs,Vitamin B,Rash on torso,This happens after 9PM
2009-05-17,Memorial Hospital,food,Diary,Upset stomach and gas, Happens after drinking whole milk
output:
perl create_indivo_schemas.pl
Name "main::IN" used only once: possible typo at create_indivo_schemas.pl line 18.
The directory and file are input_files/allergies
INThe directory and file are input_files/immunizations
INThe directory and file are input_files/..
INThe directory and file are input_files/.
open (my $fh, "< $dir/$file" ) or die ("Can't open input file $!");
while ( < $fh > ){
    print ;
}
produces this output:
perl create_indivo_schemas.pl
The directory and file are input_files/allergies
GLOB(0x8dbaa14)The directory and file are input_files/immunizations
GLOB(0x8dd4214)The directory and file are input_files/..
GLOB(0x8e33a2c)The directory and file are input_files/.
This also will not work:
open (my $fh, "< input_files/immunizations" ) or die ("Can't open input file $!");
I also tried this:
my @file_list = `ls input_files`;   # get the list of files from the input directory
foreach (@file_list){               # loop through the files that are in the directory
    chomp;
    my $filename = 'input_files/'.$_;
    print "file name is : $filename\n";
    open (my $fh, "< $filename" ) or die ("Can't open input file $!");
    while ( < $fh > ){              # look through the data in the file
        print;
        print "\n";
    }
}
Can someone tell me what's wrong? I think I had a problem with this a year ago, but I was able to get it to work on Windows. So that makes me think it's not the code. Thanks.
My font on this post seems a bit large. Sorry.

Replies are listed 'Best First'.
Re: reading files from a directory
by toolic (Bishop) on Oct 28, 2010 at 02:20 UTC
    Get rid of the whitespace surrounding your IN filehandle. Change:
    while ( < IN > ){

    To:

    while ( <IN> ){

    B::Deparse can shine a little more light upon the situation (Tip #6 from the Basic debugging checklist):

    perl -MO=Deparse 867882.pl
    my $dir = 'input_files';
    die "can't opendir $!" unless opendir DIR, $dir;
    while (defined(my $file = readdir DIR)) {
        do {
            print "The directory and file are $dir/$file\n";
            die "Can't open input file $!" unless open IN, "< $dir/$file";
            use File::Glob ();
            while (defined($_ = glob(' IN '))) {
                print $_;
            }
        };
    }
    closedir DIR;

    Here is an explanation from perlop:

    Even <$x > (note the extra space) is treated as glob("$x "), not readline($x).
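The difference described in perlop can be seen side by side in a small sketch. The scratch file and the variable names here are assumptions made for illustration; they are not from the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for input_files/allergies.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "line one\nline two\n";
close $tmp_fh or die "close: $tmp_name: $!";

open my $fh, '<', $tmp_name or die "open: < $tmp_name: $!";

# No whitespace inside the angle brackets: this is readline($fh).
my @lines = <$fh>;

# Whitespace inside the brackets: Perl treats this as a glob pattern,
# so the handle is stringified instead of being read from.
my @globbed = < $fh >;
close $fh or die "close: $tmp_name: $!";

print "readline returned ", scalar(@lines), " lines\n";
print "glob returned: @globbed\n";
```

The second print should show a GLOB(0x...) string like the ones in the question, since the pattern contains no wildcard characters and glob simply returns it.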
      Thanks very much. I am not able to print the contents of the file. I'll read about glob and the explanation in perlop. Thank you!
      my $dir = 'input_files';
      die "can't opendir $!" unless opendir DIR, $dir;
      while (defined(my $file = readdir DIR)) {
          do {
              print "The directory and file are $dir/$file\n";
              die "Can't open input file $!" unless open IN, "< $dir/$file";
              use File::Glob ();
              while (defined($_ = glob(' IN '))) {
                  print $_;
              }
          };
      }
      closedir DIR;
      Output
      perl create_indivo_schemas.pl
      Name "main::IN" used only once: possible typo at create_indivo_schemas.pl line 18.
      The directory and file are input_files/allergies
      INThe directory and file are input_files/immunizations
      INThe directory and file are input_files/..
      INThe directory and file are input_files/.
      I got it thanks! This will help with something called Indivo. http://wiki.chip.org/indivo/index.php/Schemas
      while (defined($_ = glob(' IN '))) {
          print "Hello $_";
      }
      while (<IN>){
          print "Bye " . $_ ;
      }
      I need to read about glob because I don't understand that line. Which print statement has better style? Should I use the . (join) in a print? Thanks, Kevin

        There's no need to concatenate arguments to print as that function takes a list. In fact, you're actually requesting additional processing: do the concatenation and then do the print. Even worse, if you concatenate multiple arguments you're performing multiple operations; in sort of pseudocode: print A.B.C.D performs A.B, then AB.C, then ABC.D, then print ABCD, while print A,B,C,D is one operation.

        As far as style is concerned, I'd say do whatever is easiest to read and maintain. print "X ", $_, "\n"; and print "X $_\n"; are equally valid but, as mentioned above, avoid print "X " . $_ . "\n";.

        I note there are a few places where you haven't included a newline (\n) at the end of your print statement. In case you didn't know, print doesn't automatically add one for you; say (available from Perl 5.10 with use feature 'say' or the -E switch), on the other hand, does.

        -- Ken
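The three styles discussed above can be compared in a short sketch. The sample line and the in-memory filehandle are assumptions added so the output can be inspected; they are not part of the original script.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical sample record, loosely modelled on the test file.
my $line = '2009-05-16,drugs,penicillin';

# Collect the output in a string so the three forms can be compared.
my $out = '';
open my $mem, '>', \$out or die "open in-memory handle: $!";

print {$mem} "Bye ", $line, "\n";   # list of arguments, no concatenation
print {$mem} "Bye $line\n";         # interpolation
say   {$mem} "Bye $line";           # say appends the newline itself

close $mem or die "close: $!";
print $out;
```

All three statements write the same text, so the choice between the first two is purely about readability.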

Re: reading files from a directory
by morgon (Priest) on Oct 28, 2010 at 02:28 UTC
    The problem is not in the open, but in the way you loop through the file.

    You must not have blanks between the filehandle and the angle operator (because then the angle operator sees the blanks, finds that this is not a filehandle and tries to glob it).

    Your first attempt should be OK as long as you do it like this:

    while ( <IN> ){    # no blanks around IN
    ...
    while ( <$fh> ){   # ditto
    And while we are at it, the second form (using a lexical variable rather than a package filehandle) is considered to be better style, and you should also use the 3-arg form of open:
    open (IN, "<", "$dir/$file")
      Thank you. Are you saying that IN is better than $fh?

        The recommendation these days is to use the three-argument form of open, lexical filehandles rather than package filehandles (i.e. $fh rather than IN) and to check for the success of the operation, showing the O/S error ($!) in the error message on failure.

        ...
        my $inputFile = q{/path/to/myFile};
        open my $inputFH, q{<}, $inputFile
            or die qq{open: < $inputFile: $!\n};
        ...

        I hope this is helpful.

        Cheers,

        JohnGG
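Putting the advice in this sub-thread together, the directory loop from the question might look something like the sketch below. The temporary directory, the sample file contents and the @seen array are assumptions added so the example runs on its own; the original script reads from input_files.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the poster's input_files directory.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(allergies immunizations)) {
    open my $out, '>', "$dir/$name" or die "open: > $dir/$name: $!";
    print {$out} "# Input file for $name\n";
    close $out or die "close: $dir/$name: $!";
}

my @seen;    # names of the files that were actually read
opendir my $dh, $dir or die "opendir: $dir: $!";
while (defined(my $file = readdir $dh)) {
    my $path = "$dir/$file";
    next unless -f $path;    # skips . and .. as well as subdirectories

    # Three-argument open with a lexical filehandle.
    open my $fh, '<', $path or die "open: < $path: $!";
    while (my $line = <$fh>) {    # no blanks inside the angle brackets
        print "$file: $line";
    }
    close $fh or die "close: $path: $!";
    push @seen, $file;
}
closedir $dh;
```

The -f test is one way to avoid trying to read the . and .. entries that showed up in the original output; filtering on the names directly would work as well.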

Re: reading files from a directory
by kcott (Archbishop) on Oct 28, 2010 at 02:45 UTC

    Many objects have an underlying hash structure. When you print them, you get HASH(0xnnnnnnnn). File handles are based on typeglobs; when you print them, you get GLOB(0xnnnnnnnn). This is what you're seeing here. Here's an example:

    $ cat > fh_test
    This is fh_test
    $ perl -wE 'open my $fh, q{<}, q{fh_test} or die $!; while (<$fh>) { print }'
    This is fh_test
    $ perl -wE 'open my $fh, q{<}, q{fh_test} or die $!; while (< $fh >) { print }'
    GLOB(0x10053ad8)

    toolic gave you the fix for IN. I also note you'll need the same fix for $fh. I'd check if you have any other instances of this.

    -- Ken
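The stringification described in this reply can be checked directly. This is a minimal sketch; the hash contents and the in-memory file are assumptions chosen so that nothing needs to exist on disk.

```perl
use strict;
use warnings;

# Hypothetical data, only here to have something to take a reference to.
my %record   = (type => 'drugs');
my $hash_ref = \%record;

# An in-memory file, so the sketch does not depend on fh_test existing.
my $data = "This is fh_test\n";
open my $fh, '<', \$data or die "open in-memory handle: $!";

# Interpolating a reference or a lexical filehandle yields its type and address.
my $hash_text = "$hash_ref";
my $glob_text = "$fh";
print "$hash_text\n$glob_text\n";

# Reading from the handle, by contrast, yields the file's content.
my $first_line = <$fh>;
print $first_line;
close $fh or die "close: $!";
```

The addresses printed will differ from run to run; only the HASH(...) and GLOB(...) prefixes are stable.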

      Thanks Ken!
Re: reading files from a directory
by umasuresh (Hermit) on Oct 28, 2010 at 15:18 UTC
    Another way to read files from a directory is to use File::Find. I found this to be very useful!
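A minimal sketch of the File::Find approach is shown below. The temporary directory and its two sample files are assumptions added so the example is self-contained; in the original script the starting point would be input_files.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the poster's input_files directory.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(allergies immunizations)) {
    open my $out, '>', "$dir/$name" or die "open: > $dir/$name: $!";
    print {$out} "sample\n";
    close $out or die "close: $dir/$name: $!";
}

# find() calls the sub once per entry; $_ is the entry's name within the
# current directory and $File::Find::name is its path from the start point.
my @files;
find(sub { push @files, $File::Find::name if -f $_ }, $dir);

print "$_\n" for sort @files;
```

Unlike readdir, find descends into subdirectories, so it is mainly worth the extra machinery when the input is a tree rather than a single flat directory.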