AnishaM has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perl Monks, I am here again for your help. I have two complex data structures that need to be compared with each other. Every key/value pair needs to be compared. I am using the Data::Compare module, but the problem I cannot resolve is how to sort these data structures before the comparison. Please help me with this. Thanks a lot in advance. Here is my code:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Compare;
use Data::Dumper;

my @array1 = [
{'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => 'iwf1112060'}],'objecttype' => '2'},
{'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'}
];

my @array2 = [
{'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'},
{'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => '2060'}],'objecttype' => '2'}
];

my @array3 = sort {
    $a->{platformid} cmp $b->{platformid} or
    $a->{da} cmp $b->{da} or
    $a->{ma} cmp $b->{ma} or
    $a->{os} cmp $b->{os} or
    $a->{cc} cmp $b->{cc} or
    $a->{objecttype} cmp $b->{objecttype} or
    $a->{host} cmp $b->{host} or
    $a->{size} cmp $b->{size}
} @array1;

my @array4 = sort {
    $a->{platformid} cmp $b->{platformid} or
    $a->{da} cmp $b->{da} or
    $a->{ma} cmp $b->{ma} or
    $a->{os} cmp $b->{os} or
    $a->{cc} cmp $b->{cc} or
    $a->{objecttype} cmp $b->{objecttype} or
    $a->{host} cmp $b->{host} or
    $a->{size} cmp $b->{size}
} @array2;

my $array1ref = \@array3;
my $array2ref = \@array4;

# print Dumper $array1ref;
# print Dumper $array2ref;

my $rc = Compare($array1ref,$array2ref);
if($rc == 1) {
    print "Data structures are equal";
}
else {
    print "Data structures are not equal";
}

Replies are listed 'Best First'.
Re: Compare complex perl data structures
by 1nickt (Canon) on Oct 16, 2016 at 17:13 UTC

    Hi,

    Since your arrays are not in order you may like to look at Test::Deep -- in particular its bag routines.

    Update:

    Here's a solution that works even if you do not know in advance what the elements of your arrays are going to contain. Before placing all the elements of your arrayref into a bag, it recursively goes through the elements and passes each to a subroutine that places any nested arrayrefs into their own sub-bag:

    ( Disclaimer: there may be a more elegant way to do this using some of Test::Deep's, er, deeper methods ... )

    use strict; use warnings;
    use Test::More;
    use Test::Deep;
    use utf8;

    my ( $v, $w, $x, $y ) = get_data();

    my @wanted = map { bagify($_) } @{ $v };
    cmp_deeply( $w, bag( @wanted ), 'OP data' );

    my @deeper = map { bagify($_) } @{ $x };
    cmp_deeply( $y, bag( @deeper ), 'deeper nesting' );

    done_testing;

    sub bagify {
        my $input = shift;
        my $output;
        if ( ref $input eq 'HASH' ) {
            while ( my ( $key, $val ) = each %{ $input } ) {
                $output->{ $key } = bagify($val);
            }
        }
        elsif ( ref $input eq 'ARRAY' ) {
            $output = bag( map { bagify($_) } @{ $input } );
        }
        else {
            $output = $input;
        }
        return $output;
    }

    sub get_data {
        my @v = (
          {'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'},
          {'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => '2060'}],'objecttype' => '2'}
        );
        my @w = (
          {'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => '2060'}],'objecttype' => '2'},
          {'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'}
        );
        my @x = (
          { a => { uc => 'A', accented => [qw/á à/] }, b => 'B', c => [ { foo => 'bar' }, { baz => [qw/q u x/, { nested => [qw/even more/] }] } ] },
          { d => ['D', [qw/nested array/] ], e => 'E', f => [ { xyz => 'abc' }, { zzz => 'ZZZ' } ] },
        );
        my @y = (
          { e => 'E', f => [ { zzz => 'ZZZ' }, { xyz => 'abc' } ], d => [ [qw/array nested/], 'D'] },
          { a => { uc => 'A', accented => [qw/à á/] }, c => [ { baz => [ { nested => [qw/more even/] }, qw/x u q/] }, { foo => 'bar' } ], b => 'B' },
        );
        return ( \@v, \@w, \@x, \@y );
    }

    Output:
    perl 1174098-2.pl
    ok 1 - OP data
    ok 2 - deeper nesting
    1..2

    Update: updated code to remove redundant ref() checks.

    Original reply:

    use strict; use warnings;
    use Test::More;
    use Test::Deep;

    my @x = (
      { a => 'A', b => 'B', c => [ { foo => 'bar' }, { baz => 'qux' } ] },
      { d => 'D', e => 'E', f => [ { xyz => 'abc' }, { zzz => 'ZZZ' } ] },
    );
    my @y = (
      { d => 'D', e => 'E', f => [ { xyz => 'abc' }, { zzz => 'ZZZ' } ] },
      { a => 'A', b => 'B', c => [ { foo => 'bar' }, { baz => 'qux' } ] },
    );

    cmp_deeply( \@x, bag(@y) );
    done_testing;

    Output:
    $ perl 1174098.pl
    ok 1
    1..1

    If the hash key values are themselves possibly unordered arrays you'll have to figure out from the module documentation what to do about that.
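For nested arrays whose order does not matter, a core-only alternative to the bag approach can be sketched as below. This is my own illustration, not code from the thread: the helper names canon, normalize and same_ignoring_order are made up, and it assumes the data consists only of plain hashes, arrays and scalars. Every array, at any depth, is sorted by a serialized form of its elements before the two structures are compared as strings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: one-line dump with hash keys in sorted order.
sub canon {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;
    local $Data::Dumper::Indent   = 0;
    local $Data::Dumper::Terse    = 1;
    return Dumper($data);
}

# Hypothetical helper: return a copy in which every array, at any depth,
# has been sorted by the canonical string of its normalized elements.
sub normalize {
    my ($data) = @_;
    if ( ref $data eq 'HASH' ) {
        my %copy;
        $copy{$_} = normalize( $data->{$_} ) for keys %{$data};
        return \%copy;
    }
    if ( ref $data eq 'ARRAY' ) {
        my @items = sort { canon($a) cmp canon($b) }
                    map  { normalize($_) } @{$data};
        return \@items;
    }
    return $data;    # plain scalar: nothing to reorder
}

# True when both structures hold the same data, ignoring array order.
sub same_ignoring_order {
    my ( $left, $right ) = @_;
    return canon( normalize($left) ) eq canon( normalize($right) );
}

my $first  = [ { id => '1', list => [ 'a', 'b' ] }, { id => '2' } ];
my $second = [ { id => '2' }, { list => [ 'b', 'a' ], id => '1' } ];

print same_ignoring_order( $first, $second ) ? "same\n" : "different\n";
```

One caveat with this sketch: Data::Dumper may serialize the number 22 and the string '22' differently, so values should be of a consistent type on both sides. Unlike Test::Deep, it also reports only same or different, not where the structures diverge.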

    Hope this helps :-)

    The way forward always starts with a minimal test.
Re: Compare complex perl data structures
by johngg (Canon) on Oct 16, 2016 at 11:44 UTC

    At first glance, your @array1 and @array2 will each be single element arrays containing an array reference because you have used square brackets rather than parentheses. Do either

    my @array1 = ( ... );

    or

    my $refToArray1 = [ ... ];

    I've not looked beyond that but in each case you are trying to sort single elements which is probably not what you want.
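To make the bracket versus parenthesis point concrete, here is a minimal sketch. The variable names and the trimmed-down hashes are made up for illustration. It counts the elements each form of assignment produces, and shows dereferencing for the case where a function already hands back an array reference.

```perl
use strict;
use warnings;

# Square brackets build ONE array reference, so the array holds a single element.
my @wrapped = [ { platformid => '22' }, { platformid => '100' } ];

# Parentheses build a list, so the array holds the two hashrefs directly.
my @flat = ( { platformid => '22' }, { platformid => '100' } );

# If the data arrives as an arrayref, dereference it before sorting.
my $ref      = [ { platformid => '22' }, { platformid => '100' } ];
my @from_ref = @{ $ref };

printf "wrapped: %d element(s), flat: %d, from_ref: %d\n",
    scalar @wrapped, scalar @flat, scalar @from_ref;
```

Sorting @wrapped is a no-op because there is only one element to order; sorting @flat or @from_ref actually reorders the hashrefs.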

    Cheers,

    JohnGG

      Thanks so much for replying, Johngg. These arrays were being returned by another function. I got your point and will try modifying the other function.
Re: Compare complex perl data structures
by BrowserUk (Patriarch) on Oct 16, 2016 at 11:53 UTC

    If all you need to know is whether they are the same or different -- and not how and where -- then the simplest method I know of is to use a dump routine to convert them to single strings and compare the strings.

    Data::Dump will sort the structures for you as it constructs the strings:

    #! perl -slw
    use strict;
    use Data::Dump qw[ pp ];

    my @array1 = [
    {'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => 'iwf1112060'}],'objecttype' => '2'},
    {'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'}
    ];

    my @array2 = [
    {'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'},
    {'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => '2060'}],'objecttype' => '2'}
    ];

    print 'The data structures are ', pp( \@array1 ) eq pp( \@array2 ) ? 'the same' : 'different';

    __END__
    C:\test>1174098.pl
    The data structures are different

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thanks so much for replying, BrowserUk. But the issue is that the dump comparison does not work when the two data structures are not in the same order. For example, although these two data structures contain the same data, the output says that they are different because they are not sorted:

      my @array1 = [
      {'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'},
      {'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => '2060'}],'objecttype' => '2'}
      ];

      my @array2 = [
      {'platformid' => '22','da' => 'A.9','os' => 'hp-ux-11.31','host' => '2060','cc' => 'A.9','ma' => 'A.9','size' => [{'objecttype' => 'FILESYSTEM','totalsize' => '3628129 KB','application' => '/depot','hostname' => '2060'}],'objecttype' => '2'},
      {'platformid' => '100','da' => 'A.9','os' => 'microsoft amd64 wNT-6.1-S','ma' => 'A.9','cc' => 'A.9','size' => [{'objecttype' => 'OB2BAR','totalsize' => '230986 KB','application' => 'IDB','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '1262152 KB','application' => 'R: [New Volume]','hostname' => '5096'},{'objecttype' => 'WINFS','totalsize' => '574463 KB','application' => 'C:','hostname' => '5096'}],'objecttype' => '6','host' => '5096'}
      ];

        As they stand, the data structures are not the same: arrays are ordered collections, so two arrays holding the same items in a different order are different. But if the order of the elements is not important to you, why don't you simply sort them before comparing them? Assuming the value of platformid is unique, it would be quite easy to sort your arrayrefs on that.

        Please also note what johngg told you: your @array1 and @array2 have only one element each: a reference to an array containing two elements (which are themselves hashrefs). I am not sure that's really what you want.

Re: Compare complex perl data structures
by AnomalousMonk (Archbishop) on Oct 16, 2016 at 19:51 UTC

    If each hash referent in an array contains information about a unique platform and if each platform is uniquely identified by, say, a  'platformid' => ... (or some other such) number, it may be enough to sort just by these unique platform identifier numbers:

    my @array1 = (
        {'platformid' => '22', ... },
        {'platformid' => '100', ... },
        ...,
        );

    my @ordered_array1 = sort { $a->{platformid} <=> $b->{platformid} } @array1;  # ascending numeric sort
    ...;
    my $rc = Compare(\@ordered_array1, \@ordered_array2);
    ...;

    Of course, if you have an array like

    my @array1 = (
        {'platformid' => '22', 'some' => 'stuff', ... },
        {'platformid' => '100', ... },
        ...,
        {'platformid' => '22', 'other' => 'things', ... },
        ...,
        );

    this is not going to work.

    Note also that things like  'platformid' => '22' that seem to be numbers should be numerically compared with the  <=> operator. Lexical (string-wise) comparison of numbers with the  cmp operator will give strange results.
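A small sketch of what those strange results look like. The ID values here are made up for illustration; only '22' and '100' appear in the original data.

```perl
use strict;
use warnings;

my @ids = ( '22', '100', '9' );

my @as_strings = sort { $a cmp $b } @ids;    # compared character by character
my @as_numbers = sort { $a <=> $b } @ids;    # compared by numeric value

print "cmp: @as_strings\n";    # cmp: 100 22 9
print "<=>: @as_numbers\n";    # <=>: 9 22 100
```

With cmp, '100' sorts before '22' because the character '1' comes before '2'; the length of the string plays no part.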

    Another point is that if you really need to use an enormously complex comparison like

    my @array3 = sort {
        $a->{platformid} cmp $b->{platformid} or
        $a->{da} cmp $b->{da} or
        $a->{ma} cmp $b->{ma} or
        $a->{os} cmp $b->{os} or
        $a->{cc} cmp $b->{cc} or
        $a->{objecttype} cmp $b->{objecttype} or
        $a->{host} cmp $b->{host} or
        $a->{size} cmp $b->{size}
        } @array1;

    and use it in more than one place, then this comparison can and should be encapsulated in a sanity-saving function:

    sub enormously_complex_compare {
        $a->{platformid} <=> $b->{platformid}        # numeric?
        or $a->{da} cmp $b->{da}
        or $a->{ma} cmp $b->{ma}
        or $a->{os} cmp $b->{os}
        or $a->{cc} cmp $b->{cc}
        or $a->{objecttype} <=> $b->{objecttype}     # numeric?
        or $a->{host} <=> $b->{host}                 # numeric?
        or $a->{size} <=> $b->{size}                 # ??? comparing array references ???
        }

    my @array3 = sort enormously_complex_compare @array1;
    my @array4 = sort enormously_complex_compare @array2;
    ...

    Note that the value of the  'size' key is an array reference, and comparing references as numbers (or strings) is problematic: exactly what is the point of this comparison?
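To illustrate why comparing the references themselves is problematic, here is a minimal sketch. The canon helper is hypothetical (not something from the original post), and the data is trimmed down. It contrasts a numeric comparison of two array references with a comparison of their serialized contents.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: one-line dump with hash keys in sorted order.
sub canon {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;
    local $Data::Dumper::Indent   = 0;
    local $Data::Dumper::Terse    = 1;
    return Dumper($data);
}

# Two separate arrays with identical contents.
my $size_a = [ { objecttype => 'WINFS', hostname => '5096' } ];
my $size_b = [ { hostname => '5096', objecttype => 'WINFS' } ];

# In numeric context a reference is its memory address, so this compares
# where the arrays live, not what they hold.
my $same_address = ( $size_a == $size_b ) ? 1 : 0;

# Comparing the serialized contents does not depend on the addresses.
my $same_content = ( canon($size_a) eq canon($size_b) ) ? 1 : 0;

print "same address: $same_address, same content: $same_content\n";
```

If the 'size' entry really has to take part in the sort, a term such as canon($a->{size}) cmp canon($b->{size}) would at least give an order that depends on the data rather than on memory layout; whether that order is meaningful is a separate question.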


    Give a man a fish:  <%-{-{-{-<