ss_ham has asked for the wisdom of the Perl Monks concerning the following question:

Hello PerlMonks, I'm new to the Perl world! Here is the find section of my Perl script:

#!/usr/bin/env perl
my $processed_files = 0;
find({
    wanted => sub {
        save_file($File::Find::name) if -f $_ && $_ =~ /auth\.log$/
    },
    follow => 1
}, @ARGV );

During the find in the section above, how do I exclude the directory /data/logs/master from the search and also ignore duplicate finds? Is this how we define the exclusion, and what about duplicate finds?

my $processed_files = 0;
find({
    wanted => sub {
        save_file($File::Find::name)
            if -f $_ && $_ =~ /auth\.log$/ && $File::Find::dir ne '/data/logs/master';
    },
    follow => 0
}, @ARGV );

Replies are listed 'Best First'.
Re: in Perl find call looking to exclude folder and ignore duplicate finds.
by Corion (Patriarch) on Apr 19, 2023 at 08:55 UTC

    See File::Find on $File::Find::prune. I think in your situation, you want to do something like:

    ...
    if( $File::Find::name eq '/data/logs/master' ) {
        $File::Find::prune = 1; # stop searching through this
    } elsif( -f $File::Find::name && $File::Find::name =~ /auth\.log$/ ) {
        save_file($File::Find::name)
    } else {
        # ignore the file
    }
    ...
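
    A minimal, self-contained sketch of the pruning idea above, for anyone who wants to try it without touching real logs. It builds a throwaway tree with File::Temp; the directory names and the @found array are made up for the demonstration, and "$root/master" merely stands in for /data/logs/master. It assumes the default find() settings (no bydepth, no no_chdir), so $_ is the basename inside wanted.

    ```perl
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use File::Find;
    use File::Path qw( make_path );
    use File::Temp qw( tempdir );

    # Hypothetical layout, created only for this demonstration.
    my $root = tempdir( CLEANUP => 1 );
    my $skip = "$root/master";            # stands in for /data/logs/master
    make_path( "$root/app", $skip );
    for my $file ( "$root/app/auth.log", "$skip/auth.log" ) {
        open my $fh, '>', $file or die "Cannot create $file: $!";
        close $fh or die "Cannot close $file: $!";
    }

    my @found;
    find({
        wanted => sub {
            if ( -d $_ && $File::Find::name eq $skip ) {
                $File::Find::prune = 1;   # do not descend into this directory
                return;
            }
            push @found, $File::Find::name if -f $_ && /auth\.log$/;
        },
    }, $root );

    print "$_\n" for @found;
    ```

    Only the auth.log outside the pruned directory should be printed. Note that the comparison is a plain string match on $File::Find::name, so the directory has to be spelled the same way it is reached from the starting paths.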
      So is this how it looks?

      my $processed_files = 0;
      find({
          wanted => sub {
              if( $File::Find::name eq '/data/logs/master' ) {
                  $File::Find::prune = 1; # stop searching through this
              } elsif( -f $File::Find::name && $File::Find::name =~ /auth\.log$/ ) {
                  save_file($File::Find::name)
              } else {
                  # ignore the file
              }
          },
          follow => 1
      }, @ARGV );
        use Cwd qw( abs_path );

        my %seen;
        my $processed_files = 0;
        find ({
            wanted => sub {
                $File::Find::dir =~ m{^/data/logs/master\b} and return;
                -f && m/auth\.log$/ or return;
                $seen{abs_path ($_)}++ and return;
                save_file ($File::Find::name);
                $processed_files++;
                },
            follow => 1,
            }, @ARGV);

        Enjoy, Have FUN! H.Merijn

        I don't know. Does it work for you?

Re: in Perl find call looking to exclude folder and ignore duplicate finds.
by parv (Parson) on Apr 19, 2023 at 17:51 UTC

    What is a "duplicate": same name, same inode, and/or same content?
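
    To make the three possible meanings concrete, here is a rough sketch that computes one candidate %seen key per definition. Everything in it is made up for illustration: the temporary tree, the file names, and the helpers name_key, inode_key and content_key. It relies on hard links and on meaningful device/inode numbers from stat, so it is only expected to behave this way on Unix-like systems.

    ```perl
    use strict;
    use warnings;
    use Digest::SHA;
    use File::Basename qw( basename );
    use File::Path qw( make_path );
    use File::Temp qw( tempdir );

    # Hypothetical fixture: every path below exists only for this demo.
    my $root = tempdir( CLEANUP => 1 );
    make_path( "$root/a", "$root/b" );
    write_file( "$root/a/auth.log", "login ok\n" );      # the original
    write_file( "$root/b/auth.log", "login failed\n" );  # same name, other content
    write_file( "$root/b/copy.log", "login ok\n" );      # same content, other inode
    link "$root/a/auth.log", "$root/a/hard.log"          # same inode, other name
        or die "link failed: $!";

    sub write_file {
        my ( $path, $text ) = @_;
        open my $fh, '>', $path or die "Cannot write $path: $!";
        print {$fh} $text;
        close $fh or die "Cannot close $path: $!";
    }

    # Three candidate keys for a %seen hash, one per definition of "duplicate".
    sub name_key    { basename( $_[0] ) }
    sub inode_key   { join ' ', ( stat $_[0] )[ 0, 1 ] }    # device + inode
    sub content_key { Digest::SHA->new(256)->addfile( $_[0] )->hexdigest }

    my @files = map { "$root/$_" } qw( a/auth.log a/hard.log b/auth.log b/copy.log );
    for my $file (@files) {
        printf "%-12s name=%-9s inode=%-20s sha256=%.8s\n",
            substr( $file, length($root) + 1 ),
            name_key($file), inode_key($file), content_key($file);
    }
    ```

    Each definition groups the four files differently: the name key pairs the two auth.log files, the inode key pairs the original with its hard link, and the content key groups the original, its hard link and the copy. Which key to use depends on the answer to the question above.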

Re: in Perl find call looking to exclude folder and ignore duplicate finds.
by tybalt89 (Monsignor) on Apr 23, 2023 at 19:23 UTC

    Here's a solution without using File::Find that's pretty simple and I think covers all your requirements (that is, if you are on Linux). TIMTOWTDI !

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11151755
    use warnings;

    my @paths = @ARGV;
    my %seen;
    while( my $path = shift @paths )
      {
      $path eq '/data/logs/master' and next;
      unshift @paths, grep -d, <$path/*>;
      -f and $seen{join ' ', (stat)[0,1]}++ == 0 and save_file($_) for <$path/*auth.log>;
      }
      Here's a solution without using File::Find that's pretty simple and I think covers all your requirements (that is, if you are on Linux) ...

      What in the code you posted would make it not work on UNIX or Unix-like OSen?

        I think, in this case "Linux" is meant as pars pro toto for "not Windows".

Re: in Perl find call looking to exclude folder and ignore duplicate finds.
by Anonymous Monk on Apr 19, 2023 at 15:44 UTC

    Eliminating duplicates is not an easy problem, and I know of no portable solution. I believe ack does it by using the stat() built-in to get the inode number, which it uses as the key to a hash. Except under Windows, where it uses the absolute path name computed by Cwd::abs_path().

    The untested Perl code to insert into the find() function would be something like

    # 'state' needs: use feature 'state'; (Perl 5.10 or later)
    state $found = {};
    my $key = $^O eq 'MSWin32' ?
        abs_path( $_ ) :
        do {
            my ( $dev, $ino ) = stat( $_ );
            "$dev $ino";
        };
    return if $found->{$key}++;

    Obviously if you need the Windows-specific code you will have to use Cwd 'abs_path'; somewhere.
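
    For completeness, a sketch of how that snippet might be wired into an actual find() call. It swaps the state variable for a closure so it also runs without the 'state' feature; make_first_seen, $first_seen and @kept are names invented here, and the temporary directory with its hard link exists only to produce a duplicate to filter. Like the snippet above, it is lightly exercised rather than battle-tested, and the hard-link demo assumes a Unix-like system.

    ```perl
    use strict;
    use warnings;
    use Cwd qw( abs_path );
    use File::Find;
    use File::Temp qw( tempdir );

    # Hypothetical helper: returns a closure that is true only the first
    # time a given file (by device+inode, or by absolute path on Windows)
    # is passed to it.
    sub make_first_seen {
        my %found;
        return sub {
            my ($path) = @_;
            my $key;
            if ( $^O eq 'MSWin32' ) {
                $key = abs_path($path);
            }
            else {
                my ( $dev, $ino ) = stat $path
                    or return 0;              # vanished or unreadable: skip it
                $key = "$dev $ino";
            }
            return !$found{$key}++;
        };
    }

    # Demo fixture: one file reachable under two names via a hard link.
    my $root = tempdir( CLEANUP => 1 );
    open my $fh, '>', "$root/auth.log" or die "Cannot create file: $!";
    close $fh or die "Cannot close file: $!";
    link "$root/auth.log", "$root/old-auth.log" or die "link failed: $!";

    my $first_seen = make_first_seen();
    my @kept;
    find({
        wanted => sub {
            return unless -f $_ && /auth\.log$/;
            push @kept, $File::Find::name if $first_seen->($_);
        },
    }, $root );

    print "$_\n" for @kept;
    ```

    Both names match the pattern, but only one of them should be kept; which one depends on the order in which the directory is read.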

    But what I'm really hoping is that this post will trigger someone to point out some Perl module that encapsulates all this.