IB2017 has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks

I need to read the files contained in a directory and select them according to their extensions. I'd like to use grep, as it seems to be a very fast and concise solution, but I am having problems matching more than one extension at a time. How can I improve the following? It doesn't produce any result; the problem is in the regexp used in combination with grep.

opendir (DIR, "$MyDir") or die "$!";
my @Documents = grep {/\.[docx|pdf]$/} readdir DIR;
my $number_of_files = scalar @Documents;
print "Total number of files found in folder: $number_of_files\n";
closedir DIR;

Replies are listed 'Best First'.
Re: Grep match alternative
by pryrt (Abbot) on Oct 11, 2017 at 20:57 UTC

    See perlre. [] is a character class (perlrecharclass):

    c:\>perl -le "print $_, qq(\t), /\.[docx|pdf]$/ ||0 for qw(.pdf .docx .p .d .f .d .o .c .x .|);"
    .pdf    0
    .docx   0
    .p      1
    .d      1
    .f      1
    .d      1
    .o      1
    .c      1
    .x      1
    .|      1

    (?:) is a non-capturing group:

    c:\>perl -le "print $_, qq(\t), /\.(?:docx|pdf)$/ ||0 for qw(.pdf .docx .p .d .f .d .o .c .x .|);"
    .pdf    1
    .docx   1
    .p      0
    .d      0
    .f      0
    .d      0
    .o      0
    .c      0
    .x      0
    .|      0
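
    Applied to the code in the question, the fix might look like the sketch below. The select_documents helper, the lexical directory handle, and the throwaway directory with made-up file names are assumptions added for illustration; they are not part of the original code.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the names in $dir that end in .docx or .pdf.
sub select_documents {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    # (?:docx|pdf) is an alternation; [docx|pdf] would match a single character.
    my @documents = grep { /\.(?:docx|pdf)$/ } readdir $dh;
    closedir $dh;
    my @sorted = sort @documents;
    return @sorted;
}

# Throwaway directory with made-up file names, for demonstration only.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(report.pdf notes.docx image.png archive.d)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $name: $!";
    close $fh;
}

my @documents = select_documents($dir);
print "Total number of files found in folder: ", scalar(@documents), "\n";
print "$_\n" for @documents;
```

    With these four sample files, only report.pdf and notes.docx are selected; archive.d is rejected, whereas the character-class version would have accepted it.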

    edit: fix typo (missing "?:" in my second one-liner)

      Perfect explanation, thank you!

        Your regex is simple enough (i.e., it uses no regex construct introduced after Perl version 5.6) that the hoary YAPE::Regex::Explain can still be informative:

        c:\@Work\Perl\monks>perl -wMstrict -le
        "use YAPE::Regex::Explain;
         ;;
         print YAPE::Regex::Explain->new(qr/\.[docx|pdf]$/)->explain;
        "
        The regular expression:

        (?-imsx:\.[docx|pdf]$)

        matches as follows:

        NODE                     EXPLANATION
        ----------------------------------------------------------------------
        (?-imsx:                 group, but do not capture (case-sensitive)
                                 (with ^ and $ matching normally) (with . not
                                 matching \n) (matching whitespace and #
                                 normally):
        ----------------------------------------------------------------------
          \.                       '.'
        ----------------------------------------------------------------------
          [docx|pdf]               any character of: 'd', 'o', 'c', 'x', '|',
                                   'p', 'd', 'f'
        ----------------------------------------------------------------------
          $                        before an optional \n, and the end of the
                                   string
        ----------------------------------------------------------------------
        )                        end of grouping
        ----------------------------------------------------------------------

        See also perlretut, and perlrequick.


        Give a man a fish:  <%-{-{-{-<

Re: Grep match alternative
by karlgoethebier (Abbot) on Oct 12, 2017 at 09:43 UTC
    "... concise solution..."

    Consider this TMTOWTDI:

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use Path::Iterator::Rule;
    use Data::Dump;
    use feature qw (say);

    my $rule = Path::Iterator::Rule->new;
    $rule->or( $rule->new->name("*.pdf"), $rule->new->name("*.docx") );
    my @matches = $rule->all(q(.));
    dd \@matches;
    say scalar @matches;

    __END__

    N.B.: Written in a hurry and not thoroughly tested, but it should work.
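
    Note that this rule also descends into subdirectories, which a plain readdir does not. Where installing a non-core module is not an option, a comparable recursive search can be sketched with the core File::Find module. This is a swapped-in technique rather than the code above, and find_documents is a hypothetical name.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect .pdf and .docx paths below $top, recursively.
sub find_documents {
    my ($top) = @_;
    my @matches;
    find(
        sub {
            # $_ is the bare file name; $File::Find::name is the full path.
            push @matches, $File::Find::name if -f $_ && /\.(?:docx|pdf)$/;
        },
        $top,
    );
    return @matches;
}

my @matches = find_documents('.');
print "$_\n" for @matches;
print scalar(@matches), "\n";
```

    The regex is the same non-capturing alternation as in the grep fix, so both approaches select the same extensions; only the directory traversal differs.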

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'

Re: Grep match alternative
by Anonymous Monk on Oct 11, 2017 at 20:49 UTC
    try parens instead of brackets.
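
    A minimal sketch of the difference, using made-up file names: plain parentheses group the alternation just as (?:...) does, and additionally capture the matched extension in $1, which can be handy for tallying files per extension.

```perl
use strict;
use warnings;

# Made-up file names, for illustration only.
my @names = qw(report.pdf notes.docx image.png thesis.pdf archive.d);

# (docx|pdf) groups the alternation and captures the extension in $1.
my %count;
for my $name (@names) {
    $count{$1}++ if $name =~ /\.(docx|pdf)$/;
}

print "$_: $count{$_}\n" for sort keys %count;
```

    If the captured extension is not needed, (?:docx|pdf) expresses the same match without setting $1.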
      Thank you.