Re: Eliminate exact duplicates from array of hashes

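A serialization approach: build a string key for each hash from its sorted key/value pairs joined with NUL bytes, then keep only the first hash seen for each key:-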

use 5.026;
use warnings;

use Data::Dumper;

my @test_data = (
   { Tag1 => q{1}, Tag2 => q{a} },
   { Tag1 => q{1}, Tag2 => q{a} },
   { Tag1 => q{1}, Tag2 => q{b} },
   { Tag1 => q{1}, Tag2 => q{c} },
   { Tag1 => q{1}, Tag2 => q{a} },
   { Tag1 => q{2}, Tag2 => q{a} },
   { Tag1 => q{2}, Tag2 => q{d} },
   { Tag1 => q{2}, Tag2 => q{a} },
   { Tag1 => q{3} },
   { Tag1 => q{sun}, Tag2 => q{a} },
   { Tag1 => q{sun}, Tag2 => q{a} },
   );

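my @unique = do {
   my %seen;
   # Read bottom-up: build [ key, hashref ] pairs, keep the first pair
   # seen for each key, then unwrap the hashrefs.
   map  { $_->[ 1 ] }
   grep { ! $seen{ $_->[ 0 ] } ++ }
   map  {
        my $rhItem = $_;
        [
           # Key: sorted key/value pairs joined with NUL bytes.
           (
              join qq{\x00},
                 map { join qq{\x00}, $_, $rhItem->{ $_ } }
                 sort keys %{ $rhItem }
           ),
           $rhItem
        ]
        }
   @test_data;
   };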

print Data::Dumper
   ->new( [ \ @unique ], [ qw{ *unique } ] )
   ->Sortkeys( 1 )
   ->Dumpxs();

The output:-

@unique = (
            {
              'Tag1' => '1',
              'Tag2' => 'a'
            },
            {
              'Tag1' => '1',
              'Tag2' => 'b'
            },
            {
              'Tag1' => '1',
              'Tag2' => 'c'
            },
            {
              'Tag1' => '2',
              'Tag2' => 'a'
            },
            {
              'Tag1' => '2',
              'Tag2' => 'd'
            },
            {
              'Tag1' => '3'
            },
            {
              'Tag1' => 'sun',
              'Tag2' => 'a'
            }
          );

I hope this is of interest.

Cheers,

JohnGG

Replies are listed 'Best First'.
Re^2: Eliminate exact duplicates from array of hashes
by NetWallah (Canon) on Oct 10, 2019 at 01:31 UTC

This algorithm suffers from the same issue that LanX pointed out.

The following data defeats the de-dup:

my $null = qq{\x00};
my @test_data = (
   { "a${null}1${null}b" => "2" },
   { a => 1, b => 2 },
);

My second program gives the correct results, although it too could be defeated by sufficiently crafted data.

"From there to here, from here to there, funny things are everywhere." -- Dr. Seuss
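To make the failure concrete, here is a minimal sketch that reproduces the collision and tries one possible workaround: prefixing every field with its length, so that a separator inside the data cannot shift a field boundary. The helper names nul_key and length_key are hypothetical and do not come from either program in this thread. The sketch assumes every value is a defined, plain scalar; nested references or undef would need a real serializer.

```perl
use strict;
use warnings;

# Key built as in the post above: sorted key/value pairs joined with NUL.
sub nul_key {
    my ($h) = @_;
    return join "\x00", map { ( $_, $h->{$_} ) } sort keys %$h;
}

# Hypothetical alternative: each field is written as "<length>:<field>",
# so field boundaries do not depend on any character inside the data.
sub length_key {
    my ($h) = @_;
    return join '',
        map { length($_) . ':' . $_ }
        map { ( $_, $h->{$_} ) }
        sort keys %$h;
}

my $null      = "\x00";
my @test_data = (
    { "a${null}1${null}b" => "2" },
    { a => 1, b => 2 },
);

# 1 if the two hashes produce the same key, 0 otherwise.
my $nul_collides =
    nul_key( $test_data[0] ) eq nul_key( $test_data[1] ) ? 1 : 0;
my $len_collides =
    length_key( $test_data[0] ) eq length_key( $test_data[1] ) ? 1 : 0;

# De-duplicate using the length-prefixed key.
my %seen;
my @unique = grep { !$seen{ length_key($_) }++ } @test_data;

print "NUL-joined keys collide:      $nul_collides\n";
print "length-prefixed keys collide: $len_collides\n";
print "hashes kept:                  ", scalar(@unique), "\n";
```

With the two hashes above, the NUL-joined keys are identical while the length-prefixed keys differ, so both hashes are kept. This only addresses crafted separators in flat hashes of strings; it is not a general deep-comparison.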
