arunmep has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys

I need to bypass a set of files with certain extensions. I have those extensions in a string.

$exten=|\.txt|\.doc|\.xml...

I am trying to do the following

if($filename=~/$exten/i) { print "Not bypassed"; }

but $exten is taken as a string variable rather than a regular expression. How can I solve this problem?

Re: how to get the String as regularexpression
by davorg (Chancellor) on Oct 03, 2006 at 14:33 UTC

    Firstly, please cut and paste your code into your questions here, don't try to retype them as that can create bugs. I assume you retyped your code here as:

    $exten=|\.txt|\.doc|\.xml

    doesn't even compile.

    Let's assume that you actually have:

    $exten = '|\.txt|\.doc|\.xml';

    Then the problem is not with the interpolation of the string in the match operator (as you think), it's actually a problem with your regular expression. See the following (which is based on your code):

    my $exten = '|\.txt|\.doc|\.xml';

    for (qw(foo.txt foo.csv foo.xml)) {
        print "$_: ";
        print /$exten/ ? "match" : "no match";
        print "\n";
    }

    This gives the following output:

    foo.txt: match
    foo.csv: match
    foo.xml: match

    See that everything matches, even foo.csv which looks like it shouldn't match.

    Now compare with this:

    my $exten = '\.txt|\.doc|\.xml';

    for (qw(foo.txt foo.csv foo.xml)) {
        print "$_: ";
        print /$exten/ ? "match" : "no match";
        print "\n";
    }

    Which gives this output:

    foo.txt: match
    foo.csv: no match
    foo.xml: match

    The difference is that the first version has a '|' at the start of the regex. That means the alternatives include the empty string, and the empty string matches everything.

    So, to summarise, there's nothing wrong with using a variable as a regular expression as you have done, but you need to get the regex right :-)
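
    If the string comes from somewhere you don't control, one option is to drop the empty alternatives before using it. This is only a minimal sketch: clean_alternation is a hypothetical helper name, and it assumes the pattern is a flat '|'-separated list with no groups or escaped '|' characters, and that at least one alternative is left afterwards.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: remove empty alternatives from a flat
    # '|'-separated pattern (assumes no groups and no escaped '|').
    sub clean_alternation {
        my ($pattern) = @_;
        return join '|', grep { length } split /\|/, $pattern;
    }

    my $exten = clean_alternation('|\.txt|\.doc|\.xml');   # \.txt|\.doc|\.xml

    for my $file (qw(foo.txt foo.csv foo.xml)) {
        printf "%s: %s\n", $file, $file =~ /$exten/i ? "match" : "no match";
    }
    ```

    With the leading '|' gone, foo.csv should no longer match.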

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: how to get the String as regularexpression
by reneeb (Chaplain) on Oct 03, 2006 at 14:20 UTC
    Have a look at the qr-operator

    sample:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $extensions = qr/\.doc|\.gif/;
    my @files = qw(test.doc test.gif test.png test.jpg);

    for (@files) {
        if ($_ =~ $extensions) {
            print "yes\n";
        }
    }

      qr// doesn't change the way that the string is interpolated into a regex - it simply precompiles the string into a regex (see the documentation).

      Using qr// with the code from the original post won't fix the problem. You have fixed the problem by dropping the spurious '|' from the start of the regex.
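
      To illustrate the point, here is a small sketch that compiles the original string (as I assumed it was meant to be written) with qr//. The variable names are mine, not the OP's.

      ```perl
      use strict;
      use warnings;

      my $with_pipe    = '|\.txt|\.doc|\.xml';   # leading '|' adds an empty alternative
      my $without_pipe = '\.txt|\.doc|\.xml';

      my $re_with    = qr/$with_pipe/i;
      my $re_without = qr/$without_pipe/i;

      for my $file (qw(foo.txt foo.csv)) {
          printf "%s: with pipe %s, without pipe %s\n",
              $file,
              ($file =~ $re_with    ? "match" : "no match"),
              ($file =~ $re_without ? "match" : "no match");
      }
      ```

      Precompiling changes nothing about what the pattern means: the version with the leading '|' still matches foo.csv.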

      --
      <http://dave.org.uk>

      "The first rule of Perl club is you do not talk about Perl club."
      -- Chip Salzenberg

Re: how to get the String as regularexpression
by ptum (Priest) on Oct 03, 2006 at 14:24 UTC

    This doesn't really answer your question, but you may want to check out File::Find which addresses this problem space.
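
    For what it's worth, a rough sketch of how File::Find could be combined with an extension regex. The helper name files_not_skipped and the choice to anchor with \z are my own assumptions, not something from the original question.

    ```perl
    use strict;
    use warnings;
    use File::Find;

    # Hypothetical helper: return the plain files under $dir whose
    # names do NOT match $skip_re.
    sub files_not_skipped {
        my ($dir, $skip_re) = @_;
        my @kept;
        find(sub {
            return unless -f $_;      # $_ is the file name within the current directory
            return if /$skip_re/;     # bypass files with a listed extension
            push @kept, $File::Find::name;
        }, $dir);
        return sort @kept;
    }

    my $skip = qr/\.(?:txt|doc|xml)\z/i;
    print "$_\n" for files_not_skipped('.', $skip);
    ```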

    I'm not sure what problem you are having with the use of a scalar as a regex -- the following code snippet seems to work for me:

    use strict;
    use warnings;

    my $exten = "(\\\.txt|\\\.csv)";
    my @files = ('one.txt', 'two.html', 'three.csv', 'four.log');

    foreach (@files) {
        if (/$exten/i) {
            print $_, "\n";
        }
    }

    ... and the output:

    one.txt
    three.csv

    Update: I concur with prasadbabu below, in that the judicious use of anchors can help your regex be more faithful to your intentions. Also, as davorg has noted, the main problem with your regex was the leading '|' character (which I assumed was a typo). As you probably know, the pipe character serves the function of 'or' within a regex.

    Update 2: Fixed problem with backslashes and case insensitivity, as noted by ikegami. I wasn't sure about the OP's intention, so left out the anchor deliberately.


    No good deed goes unpunished. -- (attributed to) Oscar Wilde

      Your code will also match a file named pretxt.
      my $exten = "(\.txt|\.csv)";
      should be
      my $exten = "(\\\.txt|\\\.csv)";  # (\.txt|\.csv)

      There's no reason to force the user to specify the parens.
      Why did you remove the case-insensitivity?
      You also forgot to anchor the regexp.

      use strict;
      use warnings;

      my $exten = "\\\.txt|\\\.csv";  # \.txt|\.csv
      my @files = qw( one.txt two.html three.csv four.log pretxt file.txt~ File.Txt );

      foreach (@files) {
          if (/($exten)\z/i) {
              print $_, "\n";
          }
      }

      Update: Or if you want to build $exten:

      my @exten = qw( .txt .csv );
      my $exten = join('|', map quotemeta, @exten);
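
      Putting the pieces together (quotemeta, the anchor, case-insensitivity, and precompiling with qr//), a minimal sketch might look like this. extension_regex is a hypothetical helper name, and it assumes the list contains at least one extension.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: build one anchored, case-insensitive regex
      # from a list of literal extensions (assumes a non-empty list).
      sub extension_regex {
          my @exten = @_;
          my $alternation = join '|', map { quotemeta } @exten;
          return qr/(?:$alternation)\z/i;
      }

      my $re = extension_regex(qw( .txt .csv ));

      for my $file (qw( one.txt two.html three.csv pretxt file.txt~ File.Txt )) {
          print "$file\n" if $file =~ $re;
      }
      ```

      This prints one.txt, three.csv and File.Txt, and skips pretxt and file.txt~.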
Re: how to get the String as regularexpression
by prasadbabu (Prior) on Oct 03, 2006 at 14:31 UTC

    reneeb++ and ptum++ have already given solutions to your problem.

    In addition to that, if you are getting the list of files from a directory, you can use the '$' anchor at the end of the matching regex ($exten) to avoid matching files such as '.xml.bak', '.doc.bak' etc. I think this will help you.

    use strict;
    use warnings;

    my $exten = '(\.txt|\.csv)$';
    my @files = ('one.txt.bak', 'two.html', 'three.csv', 'four.log', 'five.txt');

    foreach (@files) {
        if (/$exten/) {
            print $_, "\n";
        }
    }

    prints:

    three.csv
    five.txt

    This won't match 'one.txt.bak'.
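
    One small caveat on the choice of anchor: '$' also matches just before a trailing newline, while '\z' matches only at the very end of the string. That matters if the names are read from a file and not chomp()ed. A short sketch, with a made-up file name:

    ```perl
    use strict;
    use warnings;

    my $name = "one.txt\n";   # e.g. a line read from a file list, not chomp()ed

    my $dollar_matches = $name =~ /\.txt$/  ? 1 : 0;   # '$' tolerates the trailing newline
    my $z_matches      = $name =~ /\.txt\z/ ? 1 : 0;   # '\z' requires the true end of string

    print "with \$:  ", ($dollar_matches ? "match" : "no match"), "\n";
    print "with \\z: ", ($z_matches      ? "match" : "no match"), "\n";
    ```

    Here the '$' version matches and the '\z' version does not.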

    updated: added code

    Prasad

Re: how to get the String as regularexpression
by planetscape (Chancellor) on Oct 03, 2006 at 19:21 UTC