Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have two files whose names differ in only one letter and one digit. I am trying to find a condition that matches both file names. How can I do that?

File1: M8BBABONPGM100.dat
File2: M8BBABONPYM101.dat

if ($file =~ /M8BBABONP\aM\d/) { }

Replies are listed 'Best First'.
Re: Matching condition for two files
by Nikhil Jain (Monk) on May 02, 2011 at 07:34 UTC

    Try something like

    use strict;
    use warnings;

    while (<DATA>) {
        if ($_ =~ /M8BBABONP\wM\d+/) {
            print "Match found\n";
        }
    }

    __DATA__
    M8BBABONPGM100.dat
    M8BBABONPYM101.dat

    Output:
    Match found
    Match found

      Your regex will also match filenames with more than one digit different.

      It will also match filenames that begin with anything else or that have different extensions. To avoid that, you need to anchor the regex:

      /^M8BBABONP\wM10\d\.dat$/
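
To illustrate the difference anchoring makes, here is a minimal sketch that runs both patterns over a few file names. Only the first two names come from the question; the other three are hypothetical and were made up for this comparison.

```perl
use strict;
use warnings;

# The first two names are from the question; the rest are hypothetical.
my @names = (
    'M8BBABONPGM100.dat',       # intended match
    'M8BBABONPYM101.dat',       # intended match
    'oldM8BBABONPGM100.dat',    # extra prefix
    'M8BBABONPGM100.dat.bak',   # extra suffix
    'M8BBABONPGM250.dat',       # more than one digit differs
);

my $loose    = qr/M8BBABONP\wM\d+/;          # unanchored pattern
my $anchored = qr/^M8BBABONP\wM10\d\.dat$/;  # anchored pattern

my @loose_hits    = grep { $_ =~ $loose }    @names;
my @anchored_hits = grep { $_ =~ $anchored } @names;

print "loose:    @loose_hits\n";
print "anchored: @anchored_hits\n";
```

The unanchored pattern accepts all five names, while the anchored one accepts only the two from the question.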

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        CountZero++. I had tried it and got stuck in some muck, but I used your regex and a foreach:
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper::Concise;

        my (@array) = ('M8BBABONPGM100.dat', 'M8BBABONPYM101.dat');

        foreach my $array (@array) {
            if ($array =~ /^M8BBABONP\wM10\d\.dat$/) {
                print "We have a match: \n";
                print Dumper($array);
            }
        }

      Assuming the differing letter and digit will always be in the same positions...

      The character set \w also matches digits and _ (underscore). A more specific character set is [[:alpha:]] (the set of all alphabetic characters), or [[:upper:]] if you know the letter will always be upper-case. Also, heed the advice of CountZero concerning anchoring the ends of the match and file name extensions.

      See Character Classes and other Special Escapes in perlre, and Posix Character Classes in perlrecharclass.
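
As a rough sketch of how the three character classes differ, the snippet below tries each one against four names. Only the first name is from the question; the variants with a digit, an underscore and a lower-case letter are hypothetical test inputs.

```perl
use strict;
use warnings;

# First name is from the question; the other three are hypothetical variants.
my @candidates = (
    'M8BBABONPGM100.dat',   # upper-case letter
    'M8BBABONPgM100.dat',   # lower-case letter
    'M8BBABONP7M100.dat',   # digit
    'M8BBABONP_M100.dat',   # underscore
);

# \w accepts letters, digits and underscore.
my @word  = grep { /^M8BBABONP\wM10\d\.dat$/ }          @candidates;
# [[:alpha:]] accepts only alphabetic characters.
my @alpha = grep { /^M8BBABONP[[:alpha:]]M10\d\.dat$/ } @candidates;
# [[:upper:]] accepts only upper-case letters.
my @upper = grep { /^M8BBABONP[[:upper:]]M10\d\.dat$/ } @candidates;

printf "\\w: %d, [[:alpha:]]: %d, [[:upper:]]: %d\n",
    scalar @word, scalar @alpha, scalar @upper;
```

Each stricter class rejects more of the hypothetical variants, so pick the narrowest one that still covers the names you expect.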

      Why is there no + after \w, as in the following?
      if($_ =~ /M8BBABONP\w+M\d+/){

        Because + means "one or more". In this situation we know that only one letter changes and that the letter "M" comes right after it, so a single \w is enough.
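
A small sketch of the practical difference between \w and \w+ here. The second name, with three letters before the "M", is hypothetical and exists only to show what the extra + would let through.

```perl
use strict;
use warnings;

my $original = 'M8BBABONPGM100.dat';     # from the question
my $longer   = 'M8BBABONPGGGM100.dat';   # hypothetical: three letters before "M"

my $single = qr/M8BBABONP\wM\d+/;    # exactly one word character
my $plus   = qr/M8BBABONP\w+M\d+/;   # one or more word characters

# Record which pattern accepts which name (1 = match, 0 = no match).
my %result = (
    single_original => ($original =~ $single ? 1 : 0),
    single_longer   => ($longer   =~ $single ? 1 : 0),
    plus_original   => ($original =~ $plus   ? 1 : 0),
    plus_longer     => ($longer   =~ $plus   ? 1 : 0),
);

print "$_: $result{$_}\n" for sort keys %result;
```

Both patterns accept the original name, but only the \w+ version also accepts the longer hypothetical one.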

Re: Matching condition for two files
by Ratazong (Monsignor) on May 02, 2011 at 07:39 UTC
    \a is the escape sequence for the alarm (bell) character. You probably want to use \w instead (see perlre).
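
A quick sketch to confirm this, assuming an ASCII platform where the bell character is chr(7):

```perl
use strict;
use warnings;

my $name = 'M8BBABONPGM100.dat';   # file name from the question

# On ASCII platforms "\a" is the single BEL character, chr(7).
my $bell_is_chr7  = ("\a" eq chr(7)) ? 1 : 0;

# The original pattern looks for a literal BEL between "P" and "M".
my $alarm_matches = ($name =~ /M8BBABONP\aM\d/) ? 1 : 0;

# With \w in that position the pattern matches the letter instead.
my $word_matches  = ($name =~ /M8BBABONP\wM\d/) ? 1 : 0;

print "\\a is chr(7): $bell_is_chr7\n";
print "pattern with \\a matches: $alarm_matches\n";
print "pattern with \\w matches: $word_matches\n";
```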
Re: Matching condition for two files
by JavaFan (Canon) on May 02, 2011 at 10:42 UTC
    Untested:
    sub check {
        my ($name1, $name2) = @_;

        # Names of different length cannot differ in just two positions.
        return unless length($name1) == length($name2);

        my $letter_diff = 0;
        my $number_diff = 0;

        for (my $i = 0; $i < length($name1); $i++) {
            my $ch1 = substr $name1, $i, 1;
            my $ch2 = substr $name2, $i, 1;
            next if $ch1 eq $ch2;

            # Allow at most one differing letter and one differing digit.
            if ($ch1 =~ /\pL/ && $ch2 =~ /\pL/) { return if ++$letter_diff > 1; next }
            if ($ch1 =~ /\pN/ && $ch2 =~ /\pN/) { return if ++$number_diff > 1; next }

            return;  # Differ in non-letter/non-digits
        }

        return $letter_diff == 1 && $number_diff == 1;
    }
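
For trying the same idea out, here is a self-contained sketch with a usage example. The helper name differ_by_letter_and_digit is hypothetical, and it is a compact variant of the approach rather than JavaFan's exact code: it returns 1 or 0 and does not exit early once a second difference is seen. The second comparison uses a made-up name that differs from the first only in one digit.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when two equal-length names differ in
# exactly one letter position and exactly one digit position, else 0.
sub differ_by_letter_and_digit {
    my ($name1, $name2) = @_;
    return 0 unless length($name1) == length($name2);

    my @chars1 = split //, $name1;
    my @chars2 = split //, $name2;
    my ($letters, $digits) = (0, 0);

    for my $i (0 .. $#chars1) {
        next if $chars1[$i] eq $chars2[$i];
        if    ($chars1[$i] =~ /\pL/ && $chars2[$i] =~ /\pL/) { $letters++ }
        elsif ($chars1[$i] =~ /\pN/ && $chars2[$i] =~ /\pN/) { $digits++ }
        else  { return 0 }   # mismatch that is not letter/letter or digit/digit
    }

    return ($letters == 1 && $digits == 1) ? 1 : 0;
}

# Names from the question: one letter (G/Y) and one digit (0/1) differ.
my $question_pair = differ_by_letter_and_digit(
    'M8BBABONPGM100.dat', 'M8BBABONPYM101.dat');

# Hypothetical pair: only a digit differs, so this should be rejected.
my $digit_only_pair = differ_by_letter_and_digit(
    'M8BBABONPGM100.dat', 'M8BBABONPGM101.dat');

print "question pair: $question_pair\n";
print "digit-only pair: $digit_only_pair\n";
```

Unlike the regex answers, this compares two names against each other, so it needs no knowledge of where in the name the differing letter and digit sit.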