Identical Arrays

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Identical Arrays by davido (Cardinal) on Aug 26, 2012 at 03:57 UTC
This eliminates the sort, instead favoring the efficient operation of constructing a set of unique keys, but assumes elements are unique per array. Essentially we're throwing a little memory at the problem to avoid sorting. As a bonus, we don't even have to care what the elements look like; the hashing handles that for us behind the scenes. `my @a = ( 1, 2, 3 ); my @b = ( 2, 3, 1 ); my @c = ( 1, 2, 4 ); print "(@a) and (@b) are equal sets.\n" if identical( \@a, \@b ); print "(@a) and (@c) are equal sets.\n" if identical( \@a, \@c ); print "(@b) and (@c) are equal sets.\n" if identical( \@b, \@c ); sub identical { my( $left, $right ) = @_; return 0 if scalar @$left != scalar @$right; my %hash; @hash{ @$left, @$right } = (); return scalar keys %hash == scalar @$left; }` [download] This won't do a deep test. It just tests set equality at face value. If your arrays contain complex data structures you'll need to clarify that point. And again, it assumes each array's elements are unique within that array. This first tests if the two arrays have an equal number of elements (if they don't, there's no point continuing). Then a hash is constructed where each key represents a set entry. Both arrays contribute to the keys. If the number of keys is greater or less than the number of elements in either one of the arrays, we know that we don't have full equality (or in terms familiar to set theory, we do have a symmetric difference). Update: If you cannot assume elements are unique per array you're dealing with a "bag" (or multi-set), and this technique won't be reliable. Dealing with a multi-set you would use two hashes, and store a quantity per key. Then you would have to verify both hashes have all the same keys, and in the same quantity. Still not as inefficient as sorting, but does require two passes through the full range of elements. Here's one way to do that: use List::MoreUtils qw/any/; my @a = ( 1, 2, 3 ); my @b = ( 2, 3, 1 ); my @c = ( 1, 2, 4 ); print "(@a) and (@b) are equal sets.\n" if identical( \@a, \@b ); print "(@a) and (@c) are equal sets.\n" if identical( \@a, \@c ); print "(@b) and (@c) are equal sets.\n" if identical( \@b, \@c ); sub identical { my( $left, $right ) = @_; return 0 if scalar @$left != scalar @$right; my( %left, %right ); $left{$_}++ for @$left; $right{$_}++ for @$right; return 0 if any{ ! exists $left{$_} \|\| ! exists $right{$_} \|\| $left{$_} != $right{$_} } keys %left, keys %right; return 1; } [download] And another update: The preceding algorithm could be further optimized. There's no need to keep track of two hashes; we only want to make sure that the net tally between the two hashes is 0 per key. We could do that in a single hash like this: `use List::MoreUtils qw/any/; my @a = ( 1, 2, 3 ); my @b = ( 2, 3, 1 ); my @c = ( 1, 2, 4 ); print "(@a) and (@b) are equal sets.\n" if identical( \@a, \@b ); print "(@a) and (@c) are equal sets.\n" if identical( \@a, \@c ); print "(@b) and (@c) are equal sets.\n" if identical( \@b, \@c ); sub identical { my( $left, $right ) = @_; return 0 if scalar @$left != scalar @$right; my( %bag ); $bag{$_}++ for @$left; $bag{$_}-- for @$right; return 0 if any{ $bag{$_} } keys %bag; return 1; }` [download] And now that we've implemented that, we may as well admit we're trying to solve something that has been solved before (many times): We're looking for a condition where there is no symmetric difference between two bags (multisets). Set::Bag works fairly well: use Set::Bag; my @a = ( 1, 2, 3 ); my @b = ( 2, 3, 1 ); my @c = ( 1, 2, 4 ); print "(@a) and (@b) are equal sets.\n" if identical( \@a, \@b ); print "(@a) and (@c) are equal sets.\n" if identical( \@a, \@c ); print "(@b) and (@c) are equal sets.\n" if identical( \@b, \@c ); sub identical { my ( $left, $right ) = @_; return 0 if scalar @$left != scalar @$right; my $bag_a = Set::Bag->new; my $bag_b = Set::Bag->new; $bag_a->over_delete(1); $bag_b->over_delete(1); $bag_a->insert( $_ => 1 ) for @$left; $bag_b->insert( $_ => 1 ) for @$right; return 1 if $bag_a->difference($bag_b) eq '()' && $bag_b->difference($bag_a) eq '()'; return 0; } [download] My biggest complaint with Set::Bag is that it stringifies the results, so instead of being happy that `$bag_a->difference($bag_b)` returns nothing, we instead have to check that `$bag_a->difference($bag_b)` returns a string that looks like an empty set: `'()'`. ...and the fact that it's essentially an implementation of the two-hash approach, which can be optimized down to a single-hash. Dave	[reply] [d/l] [select]
Re: Identical Arrays by Athanasius (Archbishop) on Aug 26, 2012 at 03:07 UTC
See the FAQ How do I test whether two arrays or hashes are equal?. Update: If the array elements are numeric, this will work: `#! perl use v5.10; use strict; use warnings; my @a = (1, 2, 3); my @b = (2, 3, 1); my @c = (1, 2, 4); printf "The arrays are %s\n", identical(\@a, \@b) ? 'the same' : 'diff +erent'; printf "The arrays are %s\n", identical(\@a, \@c) ? 'the same' : 'diff +erent'; sub identical { my @aa = sort { $a <=> $b } @{ $_[0] }; my @bb = sort { $a <=> $b } @{ $_[1] }; return @aa ~~ @bb; }` [download] Output: `The arrays are the same The arrays are different` [download] Athanasius <°(((>< contra mundum	[reply] [d/l] [select]
Re: Identical Arrays by Kenosis (Priest) on Aug 26, 2012 at 06:33 UTC
Another option is to use Array::Compare, as it can handle alpha, numeric and alphanumeric array data: `use Modern::Perl; use Array::Compare; my @a = ( 1, 2, 3 ); my @b = ( 2, 3, 1 ); my @c = ( 1, 2, 4 ); my @d = qw/ a b c a /; my @e = qw/ a a b c /; my @f = qw/ 1 b c 1 /; my @g = qw/ 1 1 b c /; say 'The arrays are', ArraysIdentical( \@a, \@b ) ? ' ' : ' not ', 'id +entical.'; sub ArraysIdentical { Array::Compare->new()->perm( $_[0], $_[1] ) \|\| 0; }` [download] Output: `The arrays are identical.` [download]	[reply] [d/l] [select]
Re: Identical Arrays by CountZero (Bishop) on Aug 26, 2012 at 07:27 UTC
Easy! `use Modern::Perl; my @one = ( 1, 2, 3 ); my @two = ( 2, 3, 1 ); my @three = ( 1, 2, 4 ); say identical( \@one, \@two ); # returns 1 say identical( \@one, \@three ); # returns 0 sub identical { my ( $first, $second ) = @_; return ((join chr(0), sort @$first) eq (join chr(0), sort @$second +)) ? 1 : 0; }` [download] A few comments: Do not use `&` to call your subroutines. It is not necessary and has side-effects. The parameter list needs references to the arrays, otherwise the subroutine will not be able to distinguish between the two arrays. CountZero A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James My blog: Imperial Deltronics	[reply] [d/l] [select]
Re^2: Identical Arrays by philiprbrenan (Monk) on Aug 26, 2012 at 13:26 UTC
Not so easy! `my @one = (chr(1), chr(1), chr(1)); my @two = (chr(1).chr(0).chr(1), chr(1)); say identical(@one, @two); # returns 1` [download]	[reply] [d/l]
Re^3: Identical Arrays by CountZero (Bishop) on Aug 26, 2012 at 20:28 UTC
I know. This is one of the degenerate cases where a specially crafted string will break this simple subroutine. CountZero A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James My blog: Imperial Deltronics	[reply]
Re: Identical Arrays by Marshall (Canon) on Aug 26, 2012 at 04:28 UTC
The Perl 5.10 "smart match" is a complex critter and what it does for complex data structures is hard to understand. But for simple array's, it works. `#/usr/bin/perl -w use strict; use v5.10.0; my @x = (1, 2, 3); my @y = (2, 3, 1); my @z = (1, 2, 4); print "x and y are equivalent" if sort (@x) ~~ sort (@y); # the sort order (numeric or alpha) doesn't matter # as long as it is consistent. __END__ x and y are equivalent` [download] Oh, don't use either a or b for user Perl variables. In this case, it is ok, but $a and $b are special variables reserved for sort() and other uses. Unlike many other languages, $a is distinct from @a... the same name can be used for different variable types. However, "a" and "b" are so special that I cannot recommend that. Update: I see that I got a "down vote", well ok. Tell us what is wrong about this? I tend to stay away from this "smart match" thing, but it does work and is easy to understand for simple data structures - what it does for more complex structure IS hard to understand. The Perl sort algorithm works just fine for relatively small numbers of items.	[reply] [d/l]
Re^2: Identical Arrays by Anonymous Monk on Aug 26, 2012 at 05:07 UTC
`# the sort order (numeric or alpha) doesn't matter # as long as it is consistent.` [download] `my @x = ('1ringy-dingy2ringy-dingy', 2, 3); my @y = (2, 3, 1);` [download]	[reply] [d/l] [select]
Re^3: Identical Arrays by Marshall (Canon) on Aug 26, 2012 at 05:30 UTC
Well, indeed! I've never used smart match in production code because as this example confirms, it is not as "smart" as one might think! I stand corrected about this. I guess you have to very smart to use the "smart match". Obviously, I am not that smart.	[reply]
Re^4: Identical Arrays by davido (Cardinal) on Aug 26, 2012 at 06:14 UTC
Re: Identical Arrays by BillKSmith (Monsignor) on Aug 27, 2012 at 00:54 UTC
Your arrays appear to represent sets. Consider using diffference method of the module Set::Array. Bill	[reply]