in reply to File:Find pattern match question

Hello RockE,

As you don’t actually ask a question, I’ll have to guess that you want a way to remove duplicate directories from your output. Here is one approach:

...
my %dirs;

find(\&dir_names, $wellpath);

print "$_\n" for sort keys %dirs;

sub dir_names
{
    # skip over everything that is not a directory
    return unless -d $File::Find::name;

    # skip over directories that don't match required pattern
    return unless $File::Find::dir =~ /[IPD]\d{8}$/;

    $dirs{$File::Find::dir} = 1;
}

That is, instead of printing each directory as it is found, store it in a hash and print the hash keys after the call to find has completed. As hash keys are necessarily unique, no duplicates will be recorded.

Hope that helps,

Update: See the correction below.

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^2: File:Find pattern match question
by RockE (Novice) on Oct 31, 2013 at 06:17 UTC

    Thanks for the reply :). Helps to ask a question... I'd like the script to only report the directory name once, not print it again depending on how many sub directories it finds under the pattern matched directory. I'll try your example - cheers. We do have some duplicate directories so finding dupes is the next problem.

      "We do have some duplicate directories so finding dupes is the next problem."

      If you change this line in ++Athanasius' code:

      $dirs{$File::Find::dir} = 1;

      to

      ++$dirs{$File::Find::dir};

      The code will run the same but now you'll have a count. You can then find duplicates like this (untested):

      my @dup_dirs = grep { $dirs{$_} > 1 } keys %dirs;
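One caveat: a count only flags duplicates if the hash key can actually repeat, and a full path occurs only once in a tree. Assuming "duplicate" means the same directory name turning up under different parents, a minimal sketch might key on the bare name instead. The temporary tree and the names %paths_for and @dup_names are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Build a throwaway tree (hypothetical names) in which I12345678 occurs twice.
my $root = File::Spec->catdir(tempdir(CLEANUP => 1), 'tree');
make_path(File::Spec->catdir($root, @$_))
    for [qw(a I12345678)], [qw(b I12345678)], [qw(b P00000001)];

my %paths_for;    # directory name => list of full paths where it was found

find(sub {
    return unless -d;                 # directories only ($_ is the entry name)
    return unless /[IPD]\d{8}$/;      # name must match the pattern
    push @{ $paths_for{$_} }, $File::Find::name;
}, $root);

# A name counts as a duplicate if it was recorded at more than one path.
my @dup_names = grep { @{ $paths_for{$_} } > 1 } sort keys %paths_for;

for my $name (@dup_names) {
    print "$name:\n";
    print "    $_\n" for sort @{ $paths_for{$name} };
}
```

Keeping the list of paths rather than a bare counter means the report can say where each duplicate lives, not just that it exists.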

      -- Ken

        Hi Ken, I tried your example but it wouldn't output anything. If I remove the pattern match string it does work, but I'm not sure what it is printing out; the directory listing seems random.

        #!/usr/bin/perl
        # dirpathdupes
        use strict;
        use warnings;
        use File::Find;
        use Fcntl;

        #*****************Path Variables**********************
        our $wellpath   = 'N:\\repos\\open\\Wells\\Regulated\\';
        our $surveypath = 'N:\\repos\\open\\Surveys\\Regulated\\';
        our $testpath   = 'C:\\Temp\\';
        #*******************************************************

        my %dirs;

        find(\&dir_names, $testpath);

        my @dup_dirs = grep { $dirs{$_} > 1 } keys %dirs;

        #print "$_\n" for sort keys %dirs;
        foreach my $l (@dup_dirs) {
            print "$l\n";
        }

        sub dir_names {
            # skip over everything that is not a directory
            return unless -d $File::Find::name;

            # skip over directories that don't match required pattern
            return unless $File::Find::dir =~ /[IPD]\d{8}$/;

            ++$dirs{$File::Find::dir};
        }

Re^2: File:Find pattern match question
by RockE (Novice) on Nov 01, 2013 at 00:50 UTC

    OK, I tried your example and it doesn't print anything out. If I remove the pattern matching requirement it does work, but obviously it shows me all directories.

    #!/usr/bin/perl
    # dirpath
    use strict;
    use warnings;
    use File::Find;
    use Fcntl;

    #*****************Path Variables**********************
    our $wellpath   = 'N:\\repos\\open\\Wells\\Regulated\\';
    our $surveypath = 'N:\\repos\\open\\Surveys\\Regulated\\';
    our $testpath   = 'C:\\Temp\\';
    #*******************************************************

    my %dirs;

    find(\&dir_names, $testpath);

    print "$_\n" for sort keys %dirs;

    sub dir_names {
        # skip over everything that is not a directory
        return unless -d $File::Find::name;

        # skip over directories that don't match required pattern
        return unless $File::Find::dir =~ /[IPD]\d{8}$/;

        $dirs{$File::Find::dir} = 1;
    }

      According to the documentation for File::Find:

      The wanted function takes no arguments but rather does its work through a collection of variables.

          $File::Find::dir is the current directory name,
          $_ is the current filename within that directory
          $File::Find::name is the complete pathname to the file.

      The above variables have all been localized and may be changed without affecting data outside of the wanted function.
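To see what these three variables hold on each call, a small throwaway script can help. This is only a sketch: the temporary tree and the names $root and @seen are invented for the demonstration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical tree: <tmp>/I12345678/logs
my $root = tempdir(CLEANUP => 1);
make_path(File::Spec->catdir($root, 'I12345678', 'logs'));

# Record the three variables for every entry that find() visits.
my @seen;
find(sub {
    push @seen, {
        dir   => $File::Find::dir,     # directory containing the entry
        entry => $_,                   # bare name of the entry
        name  => $File::Find::name,    # dir and entry joined into a full path
    };
}, $root);

for my $s (@seen) {
    print "dir=$s->{dir}  \$_=$s->{entry}  name=$s->{name}\n";
}
```

For the entry logs, $File::Find::dir ends in I12345678 while $_ is just logs, so a pattern applied to $File::Find::dir describes the parent of the entry rather than the entry itself.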

      So this line:

      return unless $File::Find::dir =~ /[IPD]\d{8}$/;

      is actually testing the parent directory, not the current file. Better to run both tests against the current filename in $_:

      sub dir_names
      {
          return unless -d $_;
          return unless $_ =~ /[IPD]\d{8}$/;
          $dirs{$File::Find::name} = 1;
      }

      or just:

      sub dir_names
      {
          return unless -d;
          return unless /[IPD]\d{8}$/;
          $dirs{$File::Find::name} = 1;
      }

      I think that fixes the problem.
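A side-by-side run makes the difference between the two tests visible. The following is a sketch under assumptions: the temporary tree, and the hash names %by_parent and %by_entry, are hypothetical and exist only for this comparison.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical test tree: one matching directory is empty,
# the other contains a single subdirectory.
my $root = File::Spec->catdir(tempdir(CLEANUP => 1), 'tree');
make_path(File::Spec->catdir($root, 'I12345678'));
make_path(File::Spec->catdir($root, 'P00000001', 'sub'));

my (%by_parent, %by_entry);

find(sub {
    return unless -d;    # directories only

    # Earlier test: pattern applied to the parent directory.
    $by_parent{$File::Find::dir} = 1 if $File::Find::dir =~ /[IPD]\d{8}$/;

    # Corrected test: pattern applied to the current entry in $_.
    $by_entry{$File::Find::name} = 1 if /[IPD]\d{8}$/;
}, $root);

print "parent test: $_\n" for sort keys %by_parent;
print "entry test:  $_\n" for sort keys %by_entry;
```

The parent test only records a matching directory when that directory has at least one subdirectory, so the empty I12345678 is missed. That would also explain empty output on a test path where the matching directories contain no subdirectories.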

      Hope that helps,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        Works perfectly! :-) Thank you very much! Cheers

Re^2: File:Find pattern match question
by Anonymous Monk on Mar 24, 2016 at 02:22 UTC
    How do I fix the code to only list the directory once. Replace the period with a question mark (?) and you'll see he did have a question. Idiot.

      "How do I fix the code to only list the directory once. Replace the period with a question mark (?) and you'll see he did have a question. Idiot."

      You read all the replies and find the one which answered the question years ago.

      If you want somebody to point it out for you, don't be rude.