gw1500se has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a script that will parse the files and directories of a given directory. I want to determine the file name, type and size at a minimum. This seems like it should be a rather simple thing but I am having no luck getting anything to work. The built-in function 'readdir' doesn't seem to give me the information I need and I can't seem to get other things like File::Listing to work. Can someone give me a simple command that will return this information given a single file name? I can handle the looping myself. Thanks.

Replies are listed 'Best First'.
Re: Parsing a directory
by pc88mxer (Vicar) on Jun 28, 2008 at 01:37 UTC
    readdir() will give you the file name (leaf name). For the file size you can use the stat() function.

    What do you mean by "file type"? It could mean many things. Do you just need the suffix of the file name, or do you actually want to inspect the contents of the file to see whether its structure matches one of a list of known file structures? For the latter, have a look at File::MMagic.

    One common trip-up for people using readdir() is that they forget that they need to prepend the parent directory path to obtain a complete path. This is especially true when passing the file name to another function. Here is a good idiom to follow:

    opendir(D, $dir) or die "unable to read $dir: $!";
    while (defined(my $leaf = readdir(D))) {
        my $path = "$dir/$leaf";  # perhaps use File::Spec here
        ...
        some_function($path);     # often passing $leaf here is a mistake
        ...
    }
    closedir(D);
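A minimal sketch of that idiom applied to the original question, assuming a Unix-like system. The helper name list_dir is made up for illustration. It collects the leaf name, full path, and size (element 7 of the stat() list) of each entry:

```perl
use strict;
use warnings;

# list_dir is a hypothetical helper, not part of any module.
# Returns one hash ref per directory entry, skipping "." and "..".
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "unable to read $dir: $!";
    my @entries;
    while (defined(my $leaf = readdir($dh))) {
        next if $leaf eq '.' or $leaf eq '..';
        my $path = "$dir/$leaf";    # prepend the parent directory
        my @st   = stat($path);     # empty list if stat fails (e.g. dangling symlink)
        push @entries, {
            name => $leaf,
            path => $path,
            size => (@st ? $st[7] : undef),   # element 7 is the size in bytes
        };
    }
    closedir($dh);
    return @entries;
}

for my $entry (list_dir('.')) {
    my $size = defined $entry->{size} ? $entry->{size} : '?';
    printf "%-30s %s\n", $entry->{name}, $size;
}
```

Note that stat() is called on $path, not on $leaf. Passing the bare leaf name only works when the script happens to be running inside $dir.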
      Just a quick question to verify my thinking with respect to prepending the parent directory. I think I found an exception to this rule. If the parent is the root directory (/) then it is a special case because I get //somefile.name. Correct?
        No - that case is no different. The values returned by readdir() will not contain any leading slash. Just try it:
        opendir(D, "/");
        for (readdir(D)) { print "got: $_\n" }
        closedir(D);
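If the doubled slash that plain "$dir/$leaf" interpolation produces for the root directory is a concern, File::Spec can build the path instead. A small sketch, assuming a Unix-like system; join_path is a hypothetical wrapper name:

```perl
use strict;
use warnings;
use File::Spec;

# join_path is a hypothetical wrapper around File::Spec->catfile,
# which joins the pieces and cleans up redundant separators.
sub join_path {
    my ($dir, $leaf) = @_;
    return File::Spec->catfile($dir, $leaf);
}

print join_path('/', 'etc'), "\n";      # root directory as the parent
print join_path('/usr', 'lib'), "\n";   # ordinary parent directory
```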
Re: Parsing a directory
by kabeldag (Hermit) on Jun 28, 2008 at 01:38 UTC
    Perl Functions has all those cool file test evaluator thingy's listed ;) :
    -r  File is readable by effective uid/gid.
    -w  File is writable by effective uid/gid.
    -x  File is executable by effective uid/gid.
    -o  File is owned by effective uid.
    -R  File is readable by real uid/gid.
    -W  File is writable by real uid/gid.
    -X  File is executable by real uid/gid.
    -O  File is owned by real uid.
    -e  File exists.
    -z  File has zero size (is empty).
    -s  File has nonzero size (returns size in bytes).
    -f  File is a plain file.
    -d  File is a directory.
    -l  File is a symbolic link.
    -p  File is a named pipe (FIFO), or Filehandle is a pipe.
    -S  File is a socket.
    -b  File is a block special file.
    -c  File is a character special file.
    -t  Filehandle is opened to a tty.
    -u  File has setuid bit set.
    -g  File has setgid bit set.
    -k  File has sticky bit set.
    -T  File is an ASCII text file (heuristic guess).
    -B  File is a "binary" file (opposite of -T).
    -M  Script start time minus file modification time, in days.
    -A  Same for access time.
    -C  Same for inode change time (Unix, may differ for other platforms)
    my $file = './thefile.sh';
    print "File exists and seems to contain data"
        if (-e $file && -f $file && (-s $file) > 0);
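For the "file type" part of the question, a few of these tests can be combined into one classifier. This is only a sketch; file_type is a hypothetical helper name, and the categories are one possible choice:

```perl
use strict;
use warnings;

# file_type is a hypothetical helper name.
sub file_type {
    my ($path) = @_;
    return 'link'      if -l $path;   # test -l first: -d and -f follow symlinks
    return 'directory' if -d $path;
    return 'file'      if -f $path;
    return 'missing'   unless -e $path;
    return 'other';                   # pipe, socket, device and so on
}

print "$_: ", file_type($_), "\n" for ('.', $0);
```

The order matters: -d and -f look at the target of a symbolic link, so a link to a directory would be reported as a directory if -l were not checked first.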
      FWIW, quick pointer to beautified perldoc: -X
Re: Parsing a directory
by jethro (Monsignor) on Jun 28, 2008 at 01:42 UTC
    There are 27 file test operators. For example, (-d $file) is true if $file is a directory, (-x $file) is true if the file is executable, and (-T $file) is true if $file is a text file. -s returns the size of the file. They are listed in the perlfunc man page; search for '-X FILEHANDLE'. In the Camel book (2nd edition) it's on page 85.

    Also there is the stat function, which returns a 13 element list of information about the file. Again perlfunc has the gory details.
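A short sketch of unpacking that list, with the field order as documented in perlfunc. The path "." is just a placeholder, and the S_ISDIR/S_ISREG helpers come from the core Fcntl module:

```perl
use strict;
use warnings;
use Fcntl ':mode';   # S_ISDIR, S_ISREG and friends

my $path = '.';      # any path; "." is only for illustration
my ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev,
    $size, $atime, $mtime, $ctime, $blksize, $blocks) = stat($path)
    or die "cannot stat $path: $!";

# $mode carries the file type bits as well as the permissions.
printf "%s: %d bytes, %s\n", $path, $size,
    S_ISDIR($mode) ? 'directory' : S_ISREG($mode) ? 'plain file' : 'other';
```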

Re: Parsing a directory
by NetWallah (Canon) on Jun 28, 2008 at 04:02 UTC
    I wrote an (undocumented) Directory parsing module recently - it does what you need and probably a lot more.

    Feel free to plagiarize, or use as-is.

    Here is how I have used it...

    my $dir = new Directory(<Path to directory>);
    ...
    if ($ShowDir and my $d = $ShowDir->MatchFileReturnDir($title)){
        # This file already downloaded
        ...
        $ShowDir->Create_Intermediate_Directories_If_Necessary( $_->{URL} );
    }
    $dir->print ({HEADERONLY => 1});
    $dir->print({RECURSE=>1, DETAILS=> 0, HEADER=> 0}); # How much Disk are we eating
    The module..


Re: Parsing a directory
by gw1500se (Beadle) on Jun 28, 2008 at 22:40 UTC
    Wow! Thanks to all. I'm a bit overwhelmed by all the info. Some of the suggestions I've tried and they didn't work, but it's probably because I don't know quite how to properly structure my variables. It will take me a while to understand all this. The bottom line is that I am trying to recursively parse the directory to obtain a list of all the files in it and their sizes. When I say I need the file type, I mostly need to identify directories for the recursion and files/links in the tree for my list. Sort of like a 'find' command, which I tried first but which turned out to be way too slow as a shell script.
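For that find-like requirement, one possible approach is the core File::Find module, which handles the recursion itself. The sketch below is an assumption about what is wanted, not a tested solution for this poster's tree; list_tree is a hypothetical helper name, and it lists plain files and links with their sizes while skipping the directories themselves:

```perl
use strict;
use warnings;
use File::Find;

# list_tree is a hypothetical helper name. It returns a hash mapping
# each non-directory path under $top to its size in bytes.
sub list_tree {
    my ($top) = @_;
    my %size_of;
    find({
        no_chdir => 1,                      # keep $_ equal to the full path
        wanted   => sub {
            my @st = lstat($_) or return;   # lstat: do not follow symlinks
            return if -d _;                 # directories are recursed into, not listed
            $size_of{$_} = $st[7];          # element 7 is the size in bytes
        },
    }, $top);
    return %size_of;
}

my %sizes = list_tree('.');
printf "%10d  %s\n", $sizes{$_}, $_ for sort keys %sizes;
```

Because lstat() is used, a symbolic link shows up as an entry with the size of the link itself rather than the size of whatever it points to.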