in reply to Using an array that contains wildcard characters for pattern matching.

You need to turn file wildcards * (and probably ?) into the equivalent regular expression matches. The following code does that and combines the resulting match strings into a single regular expression. Note that two versions of the pattern are returned to facilitate sorting the match stings in order of most explicit to least explicit. That may not be important, but you should think about the implications.

use strict; use warnings; my $patFile = <<PATS; *.pl.* *\\PL\\* *\\pl-*.fbrb *_pl-*.fbrb *\\*_pl-00.fbrb *\\pl-00.fbrb *\\polish\\* *\\psarc\\polish*.psarc *_POL.* *_POLISH.SUB *_POL_* *_po.xvag *_polish.* *_pl.psarc *_pl2.psarc *_pol.* *_por.* PATS open my $patIn, '<', \$patFile; my @patterns = map {parsePattern($_)} <$patIn>; my $matchStr = join '|', map {$_->[1]} sort {length($b->[0]) <=> length($a->[0])} @patterns; my $regex = qr/($matchStr)/; print "Match regex is '$matchStr'\n"; while (<DATA>) { chomp; print "Matched '$_' on $1\n" if $_ =~ $regex; } sub parsePattern { my ($path) = @_; chomp $path; (my $explicit = $path) =~ tr/*//d; $path =~ s![\\/]![\\\\/]!g; $path =~ s/\./\\./g; $path =~ s/^\*//; $path .= '$' if $path !~ s/\*$//; $path =~ s/\*/.*/g; $path =~ s/\?/./g; return [$explicit, $path]; } __DATA__ c:\Build\PL\Data\test1.dat c:\Build\Data\test1.dat.wibble_POLISH.SUB c:\Build\Data\test1.dat.wibble_POLISH_SUB

Prints:

Match regex is '[\\/]psarc[\\/]polish.*\.psarc$|[\\/].*_pl-00\.fbrb$|[ +\\/]pl-00\.fbrb$|_POLISH\.SUB$|_pl2\.psarc$|[\\/]pl-.*\.fbrb$|_pl-.*\ +.fbrb$|_pl\.psarc$|[\\/]polish[\\/]|_po\.xvag$|_polish\.|_POL\.|_POL_ +|_pol\.|_por\.|\.pl\.|[\\/]PL[\\/]' Matched 'c:\Build\PL\Data\test1.dat' on \PL\ Matched 'c:\Build\Data\test1.dat.wibble_POLISH.SUB' on _POLISH.SUB

As a further generalization both / and \ path separators are accepted.

Perl is the programming world's equivalent of English

Replies are listed 'Best First'.
Re^2: Using an array that contains wildcard characters for pattern matching.
by james28909 (Deacon) on Oct 28, 2014 at 05:52 UTC
    this works great until i add anything like \turkish or \localization and i think it has to do with \t ect, but it can be easily rectified i believe. thanks for the great example as well :)
      for further reference i will attach these pattern files and a slightly modified version of my code using your regex.
      use strict; use warnings; use diagnostics; use File::Slurp; use File::Copy::Recursive qw(rcopy fcopy); use File::Basename; my @dirs; my @patterns; my $dir1 = $ARGV[0]; get_filelist($dir1); get_patterns(); move_files(); mkdir('./ripped'); sub get_filelist { my ($dir) = @_; my ($dh); if ( !opendir( $dh, $dir ) ) { return; } while ( my $file = readdir($dh) ) { next if ( -d $file ); my $path = "$dir/$file"; if ( -d $path ) { get_filelist("$path"); } else { push( @dirs, "$dir/$file" ); } } } sub get_patterns{ my @files = read_dir('patterns'); foreach my $element(@files){ open my $file, '<', "patterns/$element"; while(<$file>){ #$_ =~s/[*]/.*/g; #$_ =~s/\\/\//g; $_ =~ s[\\/][\\\\/]g; $_ =~ s/\\/\\\//g; $_ =~ s/\./\\./g; $_ =~ s/^\*//; $_ .= '$' if $_ !~ s/\*$//; $_ =~ s/\*/.*/g; $_ =~ s/\?/./g; chomp $_; $_ = $_.'|'; push @patterns, $_; } close($file); } } sub move_files{ my $regex = join('', @patterns); print $regex; for my $element(@dirs){ if ($element =~ $regex){ print "Found $element\n"; # rcopy($element, "ripped/$element"); } } }
      here is the compelet pattern set. pattern files I thought i had it working, but it was still missing alot of files, and still copying alot of files it wasnt suppose to and i am unsure on which way to go with it now. i have tried replacing '*' with '.*' and '?' with '.', but it is still failing because it is replacing all instances of * and ?

      i have also found Regexp::Wildcards i will see how that works as well