in reply to Using an array that contains wildcard characters for pattern matching.
You need to turn the file wildcards * (and probably ?) into their regular expression equivalents. The following code does that and combines the resulting match strings into a single regular expression. Note that parsePattern returns two versions of each pattern (one with the * wildcards stripped, one converted to a regex) so that the match strings can be sorted from most explicit to least explicit. That may not be important, but you should think about the implications.
use strict;
use warnings;

my $patFile = <<PATS;
*.pl.*
*\\PL\\*
*\\pl-*.fbrb
*_pl-*.fbrb
*\\*_pl-00.fbrb
*\\pl-00.fbrb
*\\polish\\*
*\\psarc\\polish*.psarc
*_POL.*
*_POLISH.SUB
*_POL_*
*_po.xvag
*_polish.*
*_pl.psarc
*_pl2.psarc
*_pol.*
*_por.*
PATS

# Read the patterns from the in-memory "file", one per line
open my $patIn, '<', \$patFile;
my @patterns = map {parsePattern($_)} <$patIn>;

# Longest explicit text first, then join the regex versions into one alternation
my $matchStr = join '|', map {$_->[1]}
    sort {length($b->[0]) <=> length($a->[0])} @patterns;
my $regex = qr/($matchStr)/;

print "Match regex is '$matchStr'\n";

while (<DATA>) {
    chomp;
    print "Matched '$_' on $1\n" if $_ =~ $regex;
}

sub parsePattern {
    my ($path) = @_;

    chomp $path;
    (my $explicit = $path) =~ tr/*//d;    # pattern text without the * wildcards
    $path =~ s![\\/]![\\\\/]!g;           # either path separator becomes the class [\\/]
    $path =~ s/\./\\./g;                  # escape literal dots
    $path =~ s/^\*//;                     # a leading * just means "not anchored at the start"
    $path .= '$' if $path !~ s/\*$//;     # no trailing *: anchor at the end
    $path =~ s/\*/.*/g;                   # any remaining * matches any run of characters
    $path =~ s/\?/./g;                    # ? matches a single character
    return [$explicit, $path];
}

__DATA__
c:\Build\PL\Data\test1.dat
c:\Build\Data\test1.dat.wibble_POLISH.SUB
c:\Build\Data\test1.dat.wibble_POLISH_SUB
Prints:
Match regex is '[\\/]psarc[\\/]polish.*\.psarc$|[\\/].*_pl-00\.fbrb$|[\\/]pl-00\.fbrb$|_POLISH\.SUB$|_pl2\.psarc$|[\\/]pl-.*\.fbrb$|_pl-.*\.fbrb$|_pl\.psarc$|[\\/]polish[\\/]|_po\.xvag$|_polish\.|_POL\.|_POL_|_pol\.|_por\.|\.pl\.|[\\/]PL[\\/]'
Matched 'c:\Build\PL\Data\test1.dat' on \PL\
Matched 'c:\Build\Data\test1.dat.wibble_POLISH.SUB' on _POLISH.SUB
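On the sorting: at each starting position Perl tries the alternatives left to right and takes the first one that succeeds. So when one pattern's text is a prefix of another's, the order decides which one ends up in $1. A minimal sketch of that effect follows; the two alternatives and the path are made up for illustration and are not taken from the pattern list above.

```perl
use strict;
use warnings;

# Hypothetical path and alternatives, chosen only to show the effect of order.
my $path = 'c:\Build\Data\menu_polish.dat';

my $shortFirst = qr/(_pol|_polish)/;    # less explicit alternative listed first
my $longFirst  = qr/(_polish|_pol)/;    # more explicit alternative listed first

# A match in list context returns the captured groups.
my ($first)  = $path =~ $shortFirst;
my ($second) = $path =~ $longFirst;

print "short first reports: $first\n";     # _pol
print "long first reports:  $second\n";    # _polish
```

Both regexes match the same paths; only the reported fragment differs. If you only need a yes/no answer, the sort is probably unnecessary.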
As a further generalization both / and \ path separators are accepted.
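One thing the substitutions in parsePattern leave alone is any other regex metacharacter in a pattern, such as + or ( in a directory name. If your patterns might contain those, a variant along the following lines may be safer. This is only a sketch: wildcardToRegex is a hypothetical name, and unlike the code above it anchors both ends, so it matches the whole path rather than reporting the fragment that matched.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): convert one wildcard
# pattern to a compiled regex, escaping literal text with quotemeta.
sub wildcardToRegex {
    my ($pattern) = @_;
    my $body = '';

    # Split on wildcards and separators; the capture group keeps them as tokens.
    for my $token (grep { length } split m{([*?\\/])}, $pattern) {
        if    ($token eq '*')         { $body .= '.*' }
        elsif ($token eq '?')         { $body .= '.' }
        elsif ($token =~ m{^[\\/]\z}) { $body .= '[\\\\/]' }    # either separator
        else                          { $body .= quotemeta $token }
    }

    return qr/^$body\z/;
}

# Made-up pattern and paths for illustration.
my $re = wildcardToRegex('*\\c++\\*.pl');

for my $path ('c:/src/c++/tool.pl', 'c:\src\c++\tool.txt') {
    printf "%-22s %s\n", $path, ($path =~ $re ? 'matches' : 'no match');
}
```

The first path should match and the second should not. Because each pattern here is anchored at both ends, joining several of them with | would not need the length sort, at the cost of losing the "matched on" fragment.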