in reply to perl regular expression

Now that code tags have been added to your node and the code is readable, the problem is clear too. You are using [] in the regex, which means "match any one of the characters within the square brackets", not "match one of these words". The fix is almost easy - use alternation inside a non-capturing group ((?:...)):

if ($filename =~ /(\.(?:htm|html|txt|pdf|ppt|csv|doc))\b/i)

Note that the repeat ({3,4}) is not needed.
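
To make the difference concrete, here is a minimal sketch. The character-class pattern is only a hypothetical reconstruction (the original regex is not quoted in this reply), and the helper names matches_class and matches_group are invented for the illustration.

```perl
use strict;
use warnings;

# Hypothetical reconstruction of a character-class attempt: the brackets
# define a SET of single characters (h, t, m, |, p, d, f), not a list of words.
my $class_re = qr/\.[htm|pdf]{3,4}\b/i;

# Alternation inside a non-capturing group, as suggested above.
my $group_re = qr/\.(?:htm|html|txt|pdf|ppt|csv|doc)\b/i;

# Illustrative helpers: return 1 on a match, 0 otherwise.
sub matches_class { my ($name) = @_; return $name =~ $class_re ? 1 : 0 }
sub matches_group { my ($name) = @_; return $name =~ $group_re ? 1 : 0 }

for my $name ('report.pdf', 'notes.mph', 'index.html') {
    printf "%-12s class=%d group=%d\n",
        $name, matches_class($name), matches_group($name);
}
```

With these sample names the class version accepts notes.mph (every letter is in the set) and rejects index.html (the letter l is not), while the group version does what was intended.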


DWIM is Perl's answer to Gödel

Re^2: perl regular expression
by reasonablekeith (Deacon) on Oct 03, 2006 at 14:45 UTC
    I know the OP had the word boundary match at the end, but I don't see its use here. Neither do I see the need for a non-capturing group. Either way, the OP should probably positively match for the end of the file name, otherwise one might end up passing a file by virtue of a match part way through the filename.

    The following won't print match...

    my $filename = "/home/root/this_is_a_valid_name.html.gz";
    if ($filename =~ m/(\.(htm|html|txt|pdf|ppt|csv|doc))$/i) {
        print "match\n"; # this won't print
    }
    ... however, swap in this one, and it passes just fine...
    my $filename = "/home/root/this_is_a_valid_name.html.gz";
    if ($filename =~ /(\.(?:htm|html|txt|pdf|ppt|csv|doc))\b/i) {
        print "match\n"; # this will print
    }
    ---
    my name's not Keith, and I'm not reasonable.

      What happens when the filename is (arbitrarily contrived, but): magical.html_parser.dll?

      Though I suppose it matters what the OP wants to do with the file(s), and whether or not a .gz'd html/txt/etc file is desired.
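
      A quick way to check, as a sketch only: the sub name check and the hyphenated variant of the filename are made up for illustration, and the two patterns are the $-anchored and \b versions discussed above (both written with a non-capturing group here).

      ```perl
      use strict;
      use warnings;

      my $anchored = qr/\.(?:htm|html|txt|pdf|ppt|csv|doc)$/i;   # extension must end the name
      my $boundary = qr/\.(?:htm|html|txt|pdf|ppt|csv|doc)\b/i;  # extension followed by a word boundary

      # Illustrative helper: returns (anchored result, boundary result) as 1/0.
      sub check {
          my ($name) = @_;
          return ( ($name =~ $anchored ? 1 : 0), ($name =~ $boundary ? 1 : 0) );
      }

      for my $name (
          'magical.html_parser.dll',
          'magical.html-parser.dll',        # hypothetical variant, not from the thread
          'this_is_a_valid_name.html.gz',
          'report.PDF',
      ) {
          my ($end, $wb) = check($name);
          printf "%-30s anchored=%d boundary=%d\n", $name, $end, $wb;
      }
      ```

      Since _ counts as a word character, \b does not fire between "html" and "_", so neither pattern accepts magical.html_parser.dll. The hyphenated variant and the .html.gz name do get past the \b version, and only the anchored pattern rejects them.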



      --chargrill
      s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)