Ksonar has asked for the wisdom of the Perl Monks concerning the following question:

Hello all... this is my first post on PerlMonks, so please excuse any mistakes. I'm trying to match two arrays. The number of elements is different (that is not the problem). The issue is that the elements are technically not identical, as you can see, but I want to match them.

I have tried the method below, along with the "~~" operator, but I think some kind of pattern matching is required, and I am not able to get it to work. Please let me know if there is a solution...

@array1=('adam west', 'daric dalon','tom helic','todd nick','riley remer');
@array2=('adam west 12', 'daric dalon mr.','tom helic (fads)','todd nick (456)','riley remer','john steve','dim madz 12');
$array1Size=@array1;
$array2Size=@array2;
$count = 0;
for ( $j = 0 ; $j < $array1Size ; $j++ )
{
    for ( $k = 0 ; $k < $array2Size ; $k++ )
    {
        if ( $array1[$j] =~ $array2[$k] )
        {
            # do something
        }
        else
        {
            # do something
        }
    }
}

Replies are listed 'Best First'.
Re: Matching arrays with different number of elements and element size
by tybalt89 (Monsignor) on Jul 13, 2017 at 15:33 UTC

    Whee! A chance to learn something new :) It's the first time I used "key generation functions" in Algorithm::Diff.

    #!/usr/bin/perl

    # http://perlmonks.org/?node_id=1195036

    use strict;
    use warnings;
    use Algorithm::Diff qw(traverse_sequences);

    my @array1=('adam west', 'daric dalon','tom helic','todd nick',
      'riley remer');
    my @array2=('adam west 12', 'daric dalon mr.','tom helic (fads)',
      'todd nick (456)','riley remer','john steve','dim madz 12');

    # generate a hash for the key generation function

    $_ = join '', map "$_\n", sort @array1, @array2;
    my %keyhash;
    $keyhash{$1} = $keyhash{$2} = $1 while /^(.*)\n(?=(\1.*)\n)/gm;

    # match

    traverse_sequences( \@array1, \@array2,
      {
        MATCH => sub {print " matched $array1[shift()] -- $array2[pop()]\n"},
        DISCARD_A => sub {print "unmatched $array1[shift()]\n"},
        DISCARD_B => sub {print "unmatched $array2[pop()]\n"},
      },
      sub { $keyhash{$_[0]} // $_[0] },
      );

    This prints:

    matched adam west -- adam west 12
    matched daric dalon -- daric dalon mr.
    matched tom helic -- tom helic (fads)
    matched todd nick -- todd nick (456)
    matched riley remer -- riley remer
    unmatched john steve
    unmatched dim madz 12

    Is this what you were looking for?
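    For readers new to the key hash in the script above, here is a minimal sketch of just that step. The %keyhash name and the regex are taken from the script; the shortened data and the $lines variable are only for illustration. Each string that is a prefix of the string directly after it in sorted order ends up sharing a key with it, and that shared key is what the key generation function returns for both.

```perl
use strict;
use warnings;

# Shortened, illustrative data (not the full lists from the question).
my @array1 = ('adam west', 'riley remer');
my @array2 = ('adam west 12', 'riley remer', 'john steve');

# Sort both lists together so a short name lands directly before
# a longer string that starts with it.
my $lines = join '', map "$_\n", sort @array1, @array2;

# For each line that is a prefix of the line after it,
# map both lines to the shorter one.
my %keyhash;
$keyhash{$1} = $keyhash{$2} = $1 while $lines =~ /^(.*)\n(?=(\1.*)\n)/gm;

print "$_ => $keyhash{$_}\n" for sort keys %keyhash;
```

    Strings with no entry in the hash (here 'john steve') fall back to themselves through the `// $_[0]` in the key generation function.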

      Hi tybalt89,

      Thank you very much!!! This is exactly what I wanted. I am very new to Perl, so I will start working through how you have done it.

      Keep it up!

Re: Matching arrays with different number of elements and element size
by Eily (Monsignor) on Jul 13, 2017 at 14:37 UTC

    What can be the differences between the two arrays? Are there only additional elements in the second, or can there be missing ones as well? Are all elements of the first array substrings of the second as is the case in your example?

    If the strings in array1 are substrings of array2, you can use index which lets you find if a string is found in another (easier to use than pattern matching), and the built-in grep can let you select elements from a list (untested):

    for my $name (@array1)
    {
      print "$name: ";
      # parentheses are needed: without them, >= would bind to $name
      print join ", ", grep { index($_, $name) >= 0 } @array2;
      print "\n";
    }

Re: Matching arrays with different number of elements and element size
by thanos1983 (Parson) on Jul 13, 2017 at 14:38 UTC

    Hello Ksonar,

    Welcome to the Monastery. Check this question (Difference between two arrays - is there a better way?), it contains all the answers to your problem.

    Update: A few comments regarding the sample code that you provided.

    When you compare if ( $array1[$j] =~ $array2[$k] ) you are not comparing the strings (read perlop/DESCRIPTION). You should use eq for strings and == for numbers. =~ is the binding operator; read more in perlop/Binding Operators.

    One final note:

    In Perl you do not need to loop over arrays C-style with for ( $j = 0 ; $j < $array1Size ; $j++ ); you can iterate with foreach my $element (@array) {} (read more in perlsyn/Foreach Loops). If you want to iterate over two equal-size arrays you can do it like this: foreach my $index ( 0 .. $#array ){}. Your arrays are not the same size, so you cannot use this method here; it is just for reference. Read more in How do I Loop over multiple arrays simultaneously ?.
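    A minimal sketch of the two loop styles mentioned above. The sample data and the @seen and @indexed arrays are made up for the example.

```perl
use strict;
use warnings;

# Illustrative data only.
my @array = ('adam west', 'daric dalon', 'tom helic');

# Iterate over the elements directly.
my @seen;
foreach my $element (@array) {
    push @seen, $element;
}

# Iterate over the indices when the position is needed;
# $#array is the index of the last element.
my @indexed;
foreach my $index ( 0 .. $#array ) {
    push @indexed, "$index: $array[$index]";
}

print "$_\n" for @seen, @indexed;
```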

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!

      Hello thanos1983. One clarification on this:

      When you compare if ( $array1[$j] =~ $array2[$k] ) you are not comparing the strings (read perlop/DESCRIPTION). You should use eq for strings and == for numbers.
      It's mostly true, if you consider "compare" to only mean check that two strings are equal. As you said, the =~ will bind the left operand to the right, which will be interpreted as a regex (so this is the same as $array1[$j] =~ m/$array2[$k]/ if $array2[$k] is a string). When the string in the regex does not have meta characters (as is the case for all strings in the first array), this will actually check that the first string contains the second. Checking for inclusion may be considered one flavor of "comparing" two strings (and the test would be true when the strings are equal).

      Do note that the strings in the second array are longer than the ones in the first, so eq won't work. $array2[$k] =~ /$array1[$j]/ might work as it checks that the second string contains the first (notice that I have inverted the two), except if the string in @array1 contains metacharacters (in which case it can either give the wrong result, or just die). That's why my advice is to use index instead.
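      To illustrate the metacharacter point, here is a small sketch comparing an unescaped regex, \Q...\E and index. The $needle and $haystack values are hypothetical, chosen so that the needle contains parentheses.

```perl
use strict;
use warnings;

# Hypothetical sample data: the needle contains regex metacharacters.
my $needle   = 'todd nick (456)';
my $haystack = 'mr. todd nick (456)';

# Unescaped: the parentheses become a capture group, so the regex
# looks for "todd nick 456" and does not match the literal text.
my $plain = $haystack =~ /$needle/ ? 1 : 0;

# \Q...\E escapes the metacharacters, so the match is literal.
my $quoted = $haystack =~ /\Q$needle\E/ ? 1 : 0;

# index never interprets its arguments as a regex.
my $by_index = index($haystack, $needle) >= 0 ? 1 : 0;

print "plain=$plain quoted=$quoted index=$by_index\n";
# prints: plain=0 quoted=1 index=1
```

      A needle with an unbalanced parenthesis, such as 'foo(', would make the unescaped version die at run time instead of returning false.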

        Hello Eily,

        You are absolutely right. I did not notice that the two arrays contain different strings, which is why I proposed eq. As you say, he should use some kind of regex solution or grep.

        Thanks, it is nice to point out minor mistakes to avoid confusion for OP.

Re: Matching arrays with different number of elements and element size
by 1nickt (Canon) on Jul 13, 2017 at 14:53 UTC

    Hi, try the function any in the core module List::Util.

    use strict; use warnings; use feature 'say';
    use List::Util 'any';

    my @wanted = ( 'foo', 'bar', 'baz', 'missing' );
    my @source = ( 'foo 42', '666 bar', 'baz (qux)' );

    for my $wanted ( @wanted ) {
        if ( any { /$wanted/ } @source ) {
            say "Found $wanted";
        } else {
            say "Cannot find $wanted";
        }
    }
    Output:
    $ perl 1195036.pl
    Found foo
    Found bar
    Found baz
    Cannot find missing


    The way forward always starts with a minimal test.

      Hi 1nickt,

      Thanks for the help. I am trying this on a different dataset and not yet getting what I want, but I will modify this and use it.

Re: Matching arrays with different number of elements and element size
by haukex (Archbishop) on Jul 13, 2017 at 18:04 UTC

    Depending on whether regex matching is appropriate for what you are trying to do, you might want to check out my tutorial Building Regex Alternations Dynamically.

    use warnings;
    use strict;
    use Data::Dump qw/pp/;

    my @needles = ('adam west', 'daric dalon','tom helic','todd nick',
        'riley remer');
    my @haystack = ('adam west 12', 'daric dalon mr.','tom helic (fads)',
        'todd nick (456)','riley remer','john steve','dim madz 12');

    my ($needle_regex) = map { qr/$_/i } join '|', map {quotemeta}
        sort { length $b <=> length $a or $a cmp $b } @needles;

    for my $str (@haystack) {
        if ( my ($match) = $str=~/($needle_regex)/ ) {
            print pp($str)," matches on ",pp($match),"\n"
        }
        else {
            print pp($str)," doesn't match\n"
        }
    }

    __END__

    "adam west 12" matches on "adam west"
    "daric dalon mr." matches on "daric dalon"
    "tom helic (fads)" matches on "tom helic"
    "todd nick (456)" matches on "todd nick"
    "riley remer" matches on "riley remer"
    "john steve" doesn't match
    "dim madz 12" doesn't match