Thanks for the reply :). It helps to ask a question... I'd like the script to report each directory name only once, rather than printing it again for every subdirectory it finds under the pattern-matched directory. I'll try your example - cheers. We do have some duplicate directories so finding dupes is the next problem.

Re^3: File:Find pattern match question
by kcott (Archbishop) on Oct 31, 2013 at 06:58 UTC
    "We do have some duplicate directories so finding dupes is the next problem."

    If you change this line in ++Athanasius' code:

    $dirs{$File::Find::dir} = 1;

    to

    ++$dirs{$File::Find::dir};

    the code will run the same, but now you'll have a count. You can then find duplicates like this (untested):

    my @dup_dirs = grep { $dirs{$_} > 1 } keys %dirs;
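A minimal, self-contained sketch of the same counting idiom, using a hypothetical list of names instead of a File::Find run (the sample data and the variable names @seen, %count and @dups are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data; not taken from the thread.
my @seen = qw(I00000001 P00000002 I00000001 D00000003 P00000002 I00000001);

# Count how often each name occurs. The pre-increment creates the
# hash entry on first use, so no initialisation is needed.
my %count;
++$count{$_} for @seen;

# Keep only the names seen more than once; sort for stable output.
my @dups = sort grep { $count{$_} > 1 } keys %count;

print "$_ seen $count{$_} times\n" for @dups;
```

With this sample data, only the two repeated names end up in @dups; the name that occurs once is filtered out by the grep.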

    -- Ken

      Hi Ken I tried your example but it wouldn't output anything. If I removed the pattern match string it does work but I'm not sure what it is printing out - the directory listing seems random.

      #!/usr/bin/perl
      # dirpathdupes
      use strict;
      use warnings;
      use File::Find;
      use Fcntl;

      #*****************Path Variables**********************
      our $wellpath = 'N:\\repos\\open\\Wells\\Regulated\\';
      our $surveypath = 'N:\\repos\\open\\Surveys\\Regulated\\';
      our $testpath = 'C:\\Temp\\';
      #*******************************************************

      my %dirs;
      find(\&dir_names, $testpath);
      my @dup_dirs = grep { $dirs{$_} > 1 } keys %dirs;
      #print "$_\n" for sort keys %dirs;
      foreach my $l (@dup_dirs) {
          print "$l\n";
      }

      sub dir_names {
          # skip over everything that is not a directory
          return unless -d $File::Find::name;
          # skip over directories that don't match required pattern
          return unless $File::Find::dir =~ /[IPD]\d{8}$/;
          ++$dirs{$File::Find::dir};
      }

        "Hi Ken I tried your example but it wouldn't output anything."

        The technique I showed should work fine. In his response below, Athanasius has highlighted the issue with your original premise (i.e. parent vs. current directory). You can still use the technique I provided, you'll just need to work it into the code fix he's shown.
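To illustrate the parent vs. current distinction: inside a File::Find callback, $File::Find::dir is the directory being scanned (the parent of the current entry), $_ is the entry's name within it, and $File::Find::name is the entry's full path. The sketch below is not the fix Athanasius posted; it is one possible way to combine the counting technique with a match on the current entry. It assumes that "duplicate" means the same directory name appearing under different parents, and it builds a hypothetical throwaway tree in a temp directory so it can run anywhere.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical test tree (names invented for illustration):
# 'I12345678' appears under both 'a' and 'b'.
my $root = tempdir(CLEANUP => 1);
make_path(
    File::Spec->catdir($root, 'a', 'I12345678', 'sub1'),
    File::Spec->catdir($root, 'a', 'I12345678', 'sub2'),
    File::Spec->catdir($root, 'b', 'I12345678'),
    File::Spec->catdir($root, 'b', 'other'),
);

my (%count, %paths);
find(sub {
    # find() has chdir'd into the parent, so $_ can be tested directly
    return unless -d $_;
    # match the current entry's name, not its parent ($File::Find::dir)
    return unless /[IPD]\d{8}$/;
    ++$count{$_};                              # once per matching directory
    push @{ $paths{$_} }, $File::Find::name;   # remember where it was found
}, $root);

my @dup_names = sort grep { $count{$_} > 1 } keys %count;

for my $name (@dup_names) {
    print "$name:\n";
    print "  $_\n" for sort @{ $paths{$name} };
}
```

Because the match is made against $_, each matching directory is counted once regardless of how many subdirectories it contains.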

        "If I removed the pattern match string it does work but I'm not sure what it is printing out - the directory listing seems random."

        Hashes are unordered: keys %hash_name will return a list of keys in an apparently random order. If you're interested, see the "Hash Algorithm" section of "perlsec: Algorithmic Complexity Attacks" for more details.

        sort may provide the ordering you want. If not, you may want to consider an array, or perhaps a more complex data structure, instead of a hash, to store your data. See "perldsc - Perl Data Structures Cookbook".
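As a small sketch of what sorting the keys looks like in practice (the hash contents here are hypothetical, not from the script above):

```perl
use strict;
use warnings;

# Hypothetical path => count data, for illustration only.
my %dirs = (
    'C:/Temp/b' => 1,
    'C:/Temp/a' => 2,
    'C:/Temp/c' => 2,
);

# keys gives no guaranteed order; sort makes the output repeatable.
my @by_name = sort keys %dirs;

# Highest count first, ties broken alphabetically by path.
my @by_count = sort { $dirs{$b} <=> $dirs{$a} || $a cmp $b } keys %dirs;

print "By name:\n";
print "  $_\n" for @by_name;
print "By count:\n";
print "  $_ ($dirs{$_})\n" for @by_count;
```

The first sort orders the paths as strings; the second uses a custom comparison block so the most frequent entries come first.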

        -- Ken