in reply to Re^2: remove element from 2D array after comparing it with other 2D array
in thread remove element from 2D array after comparing it with other 2D array

Ok, I think I understand your algorithm, it seems very similar to the LCSS idea I mentioned and showed, except you want to split words on underscores, and I think you want to compare each word exactly? (In that case, when comparing sbr_ux_side_clkack with ux_side_clk, the match would only be ux_side.)
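To make the word-by-word comparison concrete, here is a minimal sketch that splits both names on underscores and looks for the longest run of exactly matching words. The helper name best_word_run is hypothetical and only for illustration; it is not part of the code further down.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the longest run of words from the start
# of @$short that matches @$long word-for-word at some offset in @$long.
sub best_word_run {
    my ($long, $short) = @_;
    my @best;
    for my $offset ( 0 .. $#$long ) {
        my @run;
        for my $j ( 0 .. $#$short ) {
            my $word = $long->[ $offset + $j ];
            # Stop at the end of @$long or at the first mismatch
            last unless defined $word && $word eq $short->[$j];
            push @run, $word;
        }
        # Keep this run only if it is longer than the best so far
        @best = @run if @run > @best;
    }
    return @best;
}

my @long  = split /_/, 'sbr_ux_side_clkack';
my @short = split /_/, 'ux_side_clk';
my @run   = best_word_run( \@long, \@short );
print join( '_', @run ), "\n";   # prints "ux_side"
```

Because "clkack" and "clk" are compared with eq, they count as different words, which is why the run stops after "side".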

I could implement your algorithm more or less literally, although in the following I haven't implemented the 50% rule you mentioned: it just picks the @array1 element with the best match, and there is currently no protection against picking the same @array1 element twice. Instead of splicing the @array2 elements, which is a destructive operation, I use different offsets. However, as you can tell, the whole thing gets fairly complex, and because of the four (!) nested loops, performance will degrade as the strings and/or arrays get longer. But maybe this is a good starting point anyway.

By the way, @array1 and @array2 aren't very descriptive names; I recommend choosing more descriptive variable names.

use warnings;
use strict;
use Data::Dump qw/dd pp/;

my @array1 = ('ux_prim_clk', 'ux_side_clk', 'ux_xtal_frm_refclk');
my @array2 = ('ccu_ux_xtal_frm_refclk_ack', 'ibbs_ux_prim_clkack',
    'sbr_ux_side_clkack');

my @aoa1 = map { [ split /_/, $_ ] } @array1;
my @output;
# Using @array2 for the basis of ordering, so loop over that first
for my $a2 ( map { [ split /_/, $_ ] } @array2 ) {
    #print "##### "; dd $a2;  # debug
    # Now look through @array1 for the best match
    my ($highest_match, $highest_match_at_a1idx) = (-1);
    for my $i ( 0 .. $#aoa1 ) {  # in this loop, keep track of index
        my $a1 = $aoa1[$i];
        # The following code relies on @$a2 >= @$a1, so check that
        if ( @$a2 < @$a1 )
            { warn "Skipping ".pp($a2, $a1); next }
        # Try matching $a1 against $a2 at different offsets
        for my $offset ( 0 .. @$a2 - @$a1 ) {
            my $match = 0;
            # Count the number of matching elements at each offset
            for my $j ( 0 .. $#$a1 ) {
                if ( $a2->[$offset+$j] eq $a1->[$j] )
                    { $match++ }
                else { last }  # failed to match, stop looking
            }
            # If this index and offset matches better, record that
            if ( $match && $match > $highest_match ) {
                #dd $a2, $offset, $a1, $match;  # debug
                $highest_match = $match;
                $highest_match_at_a1idx = $i;
            }
        }
    }
    if ( defined $highest_match_at_a1idx )
        { push @output, $array1[$highest_match_at_a1idx] }
    else { warn "Failed to find match for ".pp($a2) }
}
dd @array1;
dd @array2;
dd @output;

__END__

("ux_prim_clk", "ux_side_clk", "ux_xtal_frm_refclk")
(
  "ccu_ux_xtal_frm_refclk_ack",
  "ibbs_ux_prim_clkack",
  "sbr_ux_side_clkack",
)
("ux_xtal_frm_refclk", "ux_prim_clk", "ux_side_clk")

Re^4: remove element from 2D array after comparing it with other 2D array
by Newbie95 (Novice) on May 03, 2019 at 09:48 UTC

    Hi Haukex, first of all, thank you so much for your effort.

    I'm sorry, but since I'm not very familiar with this language, I find this code a bit hard to comprehend. May I know what this part means:

    my ($highest_match, $highest_match_at_a1idx) = (-1);

    You have stated in the code that it will look through @array1 for the best match, but when I searched for the (-1) function on Google, it did not give me any information that I could relate to this.

    I am also a bit confused about the function of .pp in your code. Does this store the variable somewhere?

    I would really appreciate it if you could answer this. Thank you again, sir.

      my ($highest_match, $highest_match_at_a1idx) = (-1);

      You might not have spotted it but this is merely list assignment. The list on the right of the equals sign is assigned to the list on the left, so after this assignment, $highest_match has value -1. Since the left list has 2 elements but the right list has only 1, the second element on the left $highest_match_at_a1idx is undef.
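As a small illustration of the list assignment described above, the following sketch uses made-up variable names ($first, $second, $x, $y) and shows that the one-line form behaves the same as declaring the two variables separately:

```perl
use strict;
use warnings;

# Two variables on the left, only one value on the right
my ($first, $second) = (-1);

print "first is $first\n";
print "second is ", ( defined $second ? $second : 'undef' ), "\n";

# The same thing written out explicitly
my $x = -1;
my $y;    # declared without a value, so it is undef
```

So (-1) is not a function call, just a one-element list holding the number -1.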

      I am also a bit confused about the function of .pp in your code. Does this store the variable somewhere?

      No, it's just a pretty printer. See Data::Dump::pp.
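It may also help to read ".pp" as two separate things: the dot is Perl's string concatenation operator, and pp is the function imported from Data::Dump. A minimal sketch, assuming Data::Dump is installed and using made-up variable names:

```perl
use strict;
use warnings;
use Data::Dump qw/pp/;

my $words = [ 'ux', 'side', 'clk' ];

# Called where a value is expected, pp() returns the dump as a string
my $dumped = pp($words);

# "." joins two strings, so this builds a single message
my $message = "Skipping " . $dumped;

print "$message\n";
```

Nothing is stored or modified by pp here; it only produces text describing $words, which warn or print can then display.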

        Okay got it now. Thank you so much for the explanation.