in reply to Eliminate exact duplicates from array of hashes

A serialization approach:-

use 5.026;
use warnings;

use Data::Dumper;

my @test_data = (
    { Tag1 => q{1},   Tag2 => q{a} },
    { Tag1 => q{1},   Tag2 => q{a} },
    { Tag1 => q{1},   Tag2 => q{b} },
    { Tag1 => q{1},   Tag2 => q{c} },
    { Tag1 => q{1},   Tag2 => q{a} },
    { Tag1 => q{2},   Tag2 => q{a} },
    { Tag1 => q{2},   Tag2 => q{d} },
    { Tag1 => q{2},   Tag2 => q{a} },
    { Tag1 => q{3} },
    { Tag1 => q{sun}, Tag2 => q{a} },
    { Tag1 => q{sun}, Tag2 => q{a} },
);

my @unique = do {
    my %seen;
    map  { $_->[ 1 ] }
    grep { ! $seen{ $_->[ 0 ] } ++ }
    map  {
        my $rhItem = $_;
        [
            (
                join qq{\x00},
                map { join qq{\x00}, $_, $rhItem->{ $_ } }
                sort keys %{ $rhItem }
            ),
            $rhItem
        ]
    }
    @test_data;
};

print Data::Dumper
    ->new( [ \ @unique ], [ qw{ *unique } ] )
    ->Sortkeys( 1 )
    ->Dumpxs();

The output:-

@unique = (
            {
              'Tag1' => '1',
              'Tag2' => 'a'
            },
            {
              'Tag1' => '1',
              'Tag2' => 'b'
            },
            {
              'Tag1' => '1',
              'Tag2' => 'c'
            },
            {
              'Tag1' => '2',
              'Tag2' => 'a'
            },
            {
              'Tag1' => '2',
              'Tag2' => 'd'
            },
            {
              'Tag1' => '3'
            },
            {
              'Tag1' => 'sun',
              'Tag2' => 'a'
            }
          );

I hope this is of interest.

Cheers,

JohnGG

Re^2: Eliminate exact duplicates from array of hashes
by NetWallah (Canon) on Oct 10, 2019 at 01:31 UTC
    This algorithm suffers from the same issue that LanX pointed out.

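    The following data defeats the de-dup, because the first hash's only key already contains the \x00 separator, so both hashes serialize to the same string: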

    my $null = qq{\x00};
    my @test_data = (
        { "a${null}1${null}b" => "2" },
        { a => 1, b => 2 },
    );
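To make the collision concrete, here is a small sketch that rebuilds the NUL-joined string from the parent post for each of the two hashes and compares them. The helper name serialize_joined is made up for this illustration; the join logic is copied from the parent post.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the same NUL-joined key/value string
# that the parent post uses as its %seen key.
sub serialize_joined {
    my ($rhItem) = @_;
    return join qq{\x00},
        map { join qq{\x00}, $_, $rhItem->{$_} }
        sort keys %{$rhItem};
}

my $null = qq{\x00};
my @test_data = (
    { "a${null}1${null}b" => "2" },
    { a => 1, b => 2 },
);

my @keys = map { serialize_joined($_) } @test_data;

# Both hashes flatten to "a", "1", "b", "2" joined by NUL characters,
# so the grep on %seen would treat the second hash as a duplicate.
print $keys[0] eq $keys[1] ? "collision\n" : "distinct\n";
```

This prints "collision": the two structurally different hashes are indistinguishable once serialized.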
    My second program gives the correct results, although it too could be defeated by sufficiently crafted data.
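One possible way to harden the serialization (not the "second program" mentioned above, which is not shown here) is to prefix every key and value with its length, so that field boundaries no longer depend on a separator the data might contain. The following is only a sketch: the names serialize_prefixed and dedup are made up, and it assumes every value is a defined, non-reference scalar.

```perl
use strict;
use warnings;

# Hypothetical helper: emits each key and value as "<length>:<text>",
# so no character inside the data can fake a field boundary.
# Assumes values are defined plain scalars (no undef, no nested refs).
sub serialize_prefixed {
    my ($rhItem) = @_;
    return join q{},
        map { length($_) . q{:} . $_ }
        map { ( $_, $rhItem->{$_} ) }
        sort keys %{$rhItem};
}

# Hypothetical helper: keeps the first hash seen for each serialization.
sub dedup {
    my %seen;
    return grep { !$seen{ serialize_prefixed($_) }++ } @_;
}

my $null = qq{\x00};
my @unique = dedup(
    { "a${null}1${null}b" => "2" },
    { a => 1, b => 2 },
    { b => 2, a => 1 },    # genuine duplicate of the previous hash
);

print scalar(@unique), "\n";    # prints 2
```

The crafted hash now serializes with a length of 5 on its single key, while the two-key hash serializes with four fields of length 1, so only the genuine duplicate is dropped. Like the original, this compares values as strings, so 1 and "1" are treated as equal.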

                    "From there to here, from here to there, funny things are everywhere." -- Dr. Seuss