billie_t has asked for the wisdom of the Perl Monks concerning the following question:

Almost-newbie question: I'm trying to process a directory tree with many subdirectories, which contain files with names like 2.10, 2.22, 3.11, etc. I want to process all the files in all the subdirectories that begin with "1." first, then the ones beginning with "2.", etc.

I'm trying to use File::Find with a counter beginning at 1. If no file is found with that name, $_ is presumably undefined, so I want it to increment the counter and return to look for files beginning with the next number. That's where the (a?) problem occurs ("if undef" is fake code, since I can't figure out how to express it).
use File::Find;
my $name = 1;
find(\&Wanted, $dir);
sub Wanted {
    /$name\.*/;
    if undef {
        $name++;
        return;
    }
    else {
        open(FILE, $_);
        my @lines = <FILE>;
    }
}

Replies are listed 'Best First'.
Re: Processing directories (again) with File::Find
by submersible_toaster (Chaplain) on Jun 16, 2003 at 07:48 UTC

    To get the order of files correct, I think you would be better off using the preprocess option of File::Find, applying a sort to the contents of each subdirectory before the calls to \&Wanted.

    From the POD
    preprocess
    The value should be a code reference. This code reference is used to preprocess the current directory. The name of currently processed directory is in $File::Find::dir. Your preprocessing function is called after readdir() but before the loop that calls the wanted() function. It is called with a list of strings (actually file/directory names) and is expected to return a list of strings. The code can be used to sort the file/directory names alphabetically, numerically, or to filter out directory entries based on their name alone. When follow or follow_fast are in effect, preprocess is a no-op.
    use File::Find;
    find(
        {
            wanted     => \&Wanted,
            preprocess => \&PreProcess,
        },
        $dir
    );
    # Sort each directory's entries before Wanted is called on them.
    sub PreProcess {
        return sort { $a cmp $b } @_;
    }
    sub Wanted {
        if ( $_ =~ /\d+\.\d+/ ) {
            open( FILE, $_ ) or die "Screaming $!";
            my @lines = <FILE>;
            doStuffwith(@lines);
        }
        else {
            return;
        }
    }

    Tuning the sort in PreProcess will allow you to work on files in the order that you want. I suspect you'd want to filter out '.' and '..' and any other undesirable directories so that Wanted is only working on ... the files you want!
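As a rough illustration of tuning the sort, here is a sketch of a PreProcess that orders names such as 2.9, 2.10 and 2.22 by their numeric parts rather than as strings. The helpers parts and by_major_minor are hypothetical names, and the sketch assumes the file names are exactly digits, a dot, and digits, with the part after the dot read as an integer; anything else (including subdirectories) is placed last in alphabetical order. Keep in mind that preprocess sees one directory at a time, so this orders entries within each directory, not across the whole tree.

```perl
use strict;
use warnings;

# Hypothetical helper: split "2.10" into (2, 10); return an empty list otherwise.
sub parts {
    my ($name) = @_;
    return $name =~ /^(\d+)\.(\d+)$/ ? ($1, $2) : ();
}

# Compare two names: numbered names by major then minor number,
# all other names after them, alphabetically.
sub by_major_minor {
    my ($x, $y) = @_;
    my @px = parts($x);
    my @py = parts($y);
    return $x cmp $y if !@px && !@py;
    return 1  if !@px;
    return -1 if !@py;
    return $px[0] <=> $py[0] || $px[1] <=> $py[1];
}

# Drop-in replacement for the PreProcess above.
sub PreProcess {
    return sort { by_major_minor($a, $b) } @_;
}
```

With a plain cmp sort, "2.10" comes before "2.9" because the strings are compared character by character; the numeric comparison puts 2.9 first.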


    Code is untested.
    I can't believe it's not psellchecked
Re: Processing directories (again) with File::Find
by Skeeve (Parson) on Jun 16, 2003 at 06:23 UTC
    I think you'll have to loop
    use File::Find;
    my $limit = 4; # Whatever you prefer
    for ($name = 1; $name <= $limit; ++$name) {
        find(\&Wanted, $dir);
    }
    This will go through all subdirectories and search for 1.* first, then 2.*, then 3.*, and finally 4.*, or whatever you defined as $limit.
    sub Wanted {
        # /$name\.*/; # NO! You want $name at the beginning!
    Update... Made a mistake         /^$name\./;
    if (/^$name\./) { # if undef {
    If what is undef?
    What you intend to do is increment $name if no file beginning with $name is in the directory. What the code would really do, if it worked, is increment $name whenever the FIRST NAME in the directory doesn't match. It won't match, since the first name will (on *NIX) almost always be ".".
    #     $name++;
    #     return;
    # } else {
            open(FILE, $_);
            my @lines = <FILE>;
            # some processing missing here?
        }
    }
      Thanks for the suggestions, Skeeve. Unfortunately, everything still seems to be processed multiple times, in no particular order.

      One VERY IMPORTANT thing I forgot to mention, and which is probably causing problems as well as my bad syntax etc., is the fact that the directories have names that also consist of numbers. I thought that specifying /^$name\.*/ would limit it to filenames like 1.xx? (assuming a file structure of something like /03/4/1.32). I'm doing this on Windows, by the way.

      And yes, there is some processing after the file is read into an array (squirting it into a spreadsheet), but this part is the problematic bit...
        My mistake. It should have been:
        if (/^$name\./) { open....
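One way to sidestep both problems raised in this sub-thread (repeated passes over the tree, and numerically named directories getting in the way) might be to walk the tree once, keep only plain files whose names look like 1.32, and sort the collected paths before processing them. The following is only a sketch: the function name collect_numbered is hypothetical, and it assumes file names are exactly digits, a dot, and digits.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the full paths of all plain files under $dir
# whose names look like "1.32", ordered by the number before the dot,
# then by the number after it, then by path.
sub collect_numbered {
    my ($dir) = @_;
    my @found;
    find(
        sub {
            return unless -f $_;               # skip directories such as "03" or "4"
            return unless /^(\d+)\.(\d+)$/;    # keep names like "1.32" only
            push @found, [ $1, $2, $File::Find::name ];
        },
        $dir
    );
    return map  { $_->[2] }
           sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] || $a->[2] cmp $b->[2] }
           @found;
}
```

The returned list can then be looped over once, opening each path and reading its lines, so every 1.* file in every subdirectory is handled before any 2.* file, and no $limit has to be chosen in advance.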
Re: Processing directories (again) with File::Find
by TomDLux (Vicar) on Jun 16, 2003 at 18:36 UTC

    Don't forget you can specify an array of starting directories:

    find( \&Wanted, 1..9);

    and then set $File::Find::prune=1 in Wanted() when you've gone as deep as you want into a directory.
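To make the prune suggestion concrete, here is a sketch that passes several starting directories to a single find call and stops descending below a chosen depth. The names files_within and $max_depth are hypothetical. Depth is measured by counting "/" separators in $File::Find::name relative to the first starting directory, which assumes all starting directories have the same number of path components (as 1..9 would); this is just one plausible way to decide when you have gone deep enough.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: list plain files found at most $max_depth levels
# below any of the starting directories.
# Assumption: all starting directories contain the same number of "/".
sub files_within {
    my ($max_depth, @start_dirs) = @_;
    my $base = ($start_dirs[0] =~ tr{/}{});    # separators in a start dir
    my @files;
    find(
        sub {
            my $depth = ($File::Find::name =~ tr{/}{}) - $base;
            if (-d $_) {
                # Do not descend into directories at or below the limit.
                $File::Find::prune = 1 if $depth >= $max_depth;
                return;
            }
            push @files, $File::Find::name if -f _;
        },
        @start_dirs
    );
    return @files;
}
```

A call such as files_within(2, 1..9) would then visit files directly inside each starting directory and inside its immediate subdirectories, but nothing deeper.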

    --
    TTTATCGGTCGTTATATAGATGTTTGCA