Re: removing duplicates from an array of hashes
by kcott (Archbishop) on Apr 17, 2014 at 04:26 UTC
my %seen;
@$ref = grep { ! $seen{$_->{id}}++ } @$ref;
My test is in the spoiler.
Script:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dump;
my $ref = [ map { +{ id => $_ } } qw{a b c b} ];
dd $ref;
my %seen;
@$ref = grep { ! $seen{$_->{id}}++ } @$ref;
dd $ref;
Output:
[{ id => "a" }, { id => "b" }, { id => "c" }, { id => "b" }]
[{ id => "a" }, { id => "b" }, { id => "c" }]
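If the same idiom is needed for keys other than id, the grep can be wrapped in a small helper. This is only a sketch: uniq_by is a hypothetical name made up for this example (CPAN's List::UtilsBy offers a function along similar lines), and the key is computed by a code ref supplied by the caller.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): keep the first element
# for each distinct key, where the key is computed by a code ref.
sub uniq_by {
    my ($key_of, @items) = @_;
    my %seen;
    return grep { !$seen{ $key_of->($_) }++ } @items;
}

my $ref = [ map { +{ id => $_ } } qw{a b c b} ];
@$ref = uniq_by( sub { $_[0]{id} }, @$ref );

my $ids = join ',', map { $_->{id} } @$ref;
print "$ids\n";   # a,b,c
```

As with the plain grep, the first occurrence of each key is kept and the original order is preserved.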
Re: removing duplicates from an array of hashes
by bigj (Monk) on Apr 17, 2014 at 04:53 UTC
As TIMTOWTDI, I'll add an in-place version:
my %seen;
for my $i (reverse (0 .. @$ref-1)) {
# replaces current item in array with last item
# and removes the last element if id already seen
$ref->[$i] = pop @$ref if $seen{$ref->[$i]->{id}}++;
}
Note: Ordering in the array might change
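To make the note concrete, here is a small run of the same loop on a five-element list chosen for illustration (the ids a b a c d are an assumption, not from the original question). The duplicate at index 0 is overwritten by the popped last element, so the result is no longer in input order.

```perl
use strict;
use warnings;

my $ref = [ map { +{ id => $_ } } qw(a b a c d) ];

my %seen;
for my $i ( reverse 0 .. $#$ref ) {
    # A duplicate is overwritten by the popped last element, which has
    # already been visited and kept.
    $ref->[$i] = pop @$ref if $seen{ $ref->[$i]{id} }++;
}

my $ids = join ',', map { $_->{id} } @$ref;
print "$ids\n";   # d,b,a,c
```

Because the loop walks backwards, it is the last occurrence of each id that survives here, not the first.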
Update: an in-place version that keeps the ordering:
my %seen;
my $removed = 0;
for my $i (0 .. @$ref-1) {
my $item = $ref->[$i];
$seen{$item->{id}}++ ? $removed++ : ($ref->[$i-$removed] = $item);
}
# guard: with no duplicates, -0 is 0 and splice would empty the array
splice @$ref, -$removed if $removed;
Greetings,
Janek Schleicher
|
|
Just a matter of idle curiosity... I notice you use @$ref-1 instead of $#$ref for obtaining the max array index. Why do you prefer this? If $[ were ever changed (but don't do that!), @$ref-1 would yield an incorrect value for max index. (Although I suppose that one could argue that to be $[-safe, one ought always to iterate over $[ .. $#array rather than 0 .. $#array to safely visit the entire range of @array indices.)
Also, in your second code example
splice @$ref,-$removed;
might be written as the (presumably) marginally faster
$#$ref -= $removed;
Update: I should make it clear that this post was prompted mainly by the non-idiomatic nature of the @$ref-1 expression. Any "lack-of-safety" issue is only a kind of cayenne pepper icing on the cake.
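For anyone following along, a minimal sketch of the spellings under discussion, assuming $[ is left at its default of 0. It also shows one case where the two truncation forms are not interchangeable: when the count is zero, -0 is just 0, so splice removes everything from offset 0 onward, while lowering $#array by zero changes nothing. The array contents are made up for the example.

```perl
use strict;
use warnings;

my $ref = [qw(a b c d e)];

# Three spellings of the last index (with $[ left at its default of 0).
my @last = ( $#$ref, $#{$ref}, @$ref - 1 );
print "@last\n";                     # 4 4 4

# Drop the last two elements by lowering the last index.
my $removed = 2;
$#$ref -= $removed;
print "@$ref\n";                     # a b c

# With nothing to remove, the two truncation forms differ.
my @by_index  = qw(a b c);
my @by_splice = qw(a b c);
$removed = 0;
$#by_index -= $removed;              # unchanged
splice @by_splice, -$removed;        # emptied: offset -0 is offset 0
print scalar(@by_index), " ", scalar(@by_splice), "\n";   # 3 0
```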
|
|
The reasoning is simple:
I hadn't programmed for a long time (close to 10 years).
I was still sure that $#array is the last index, but I was too lazy to look up how to write it for an array ref, so I just took 0..@$arrayref-1, which works the same here and, at least when I stopped, was just as idiomatic. Every programmer in the world understands 0..length(array)-1 in a heartbeat; only Perl cracks will understand $[..$#array.
I wasn't aware that I could just write $#array -= $n to make the array smaller. To be honest, I'm not sure whether I like it. It only fits the rare cases where we basically want to pop $n items off @array and don't care what those last items were. splice is a clear idiom: we intend to remove entries from an array and then specify which ones. If the special case came up more often, OK, but how often do we even use splice? IMHO even that is rare; most of the time we pop, shift, or slice with @array[4..7,11..13] and so on, so there is no need to trick ourselves just for the sake of a trick. Code should be easy.
Anyway, pretty cool to have learned some new tricks in Perl :-)
Greetings,
Janek Schleicher
Re: removing duplicates from an array of hashes
by atcroft (Abbot) on Apr 17, 2014 at 04:02 UTC
perl -MData::Dumper -le '
my $ref = [];
$ref->[0] = { id => "a" };
$ref->[1] = { id => "b" };
$ref->[2] = { id => "c" };
$ref->[3] = { id => "b" };
print Data::Dumper->Dump( [ \$ref, ], [ qw( *ref ) ] );
my $temp;
my %seen;
while ( my $t = shift @{$ref} ) {
if (not defined $seen{$t->{id}}) {
push @{$temp}, $t;
$seen{$t->{id}}++;
}
}
print Data::Dumper->Dump( [ \$temp, ], [ qw( *temp ) ] );
'
Output:
Hope that helps.
Re: removing duplicates from an array of hashes
by NetWallah (Canon) on Apr 17, 2014 at 04:04 UTC
perl -MData::Dumper -E 'my $r=[map { id=>$_ }, ("a".."c","b")];
say Dumper $r;
my %h;
my @z= map {$h{$_->{id}}++ ?():$_ } @$r;
say Dumper \@z'
Output:
$VAR1 = [
{
'id' => 'a'
},
{
'id' => 'b'
},
{
'id' => 'c'
},
{
'id' => 'b'
}
];
$VAR1 = [
{
'id' => 'a'
},
{
'id' => 'b'
},
{
'id' => 'c'
}
];
What is the sound of Perl? Is it not the sound of a wall that people have stopped banging their heads against?
-Larry Wall, 1992
Re: removing duplicates from an array of hashes
by NetWallah (Canon) on Apr 17, 2014 at 05:32 UTC
In-place TIMTOWTDI using 'delete', inspired by bigj (++):
perl -MData::Dumper -E 'my $r=[map { id=>$_ }, ("a".."c","b")];
say Dumper $r;
my %h;
$h{$r->[$_]{id}}++ and delete $r->[$_] for 0..$#$r;
say Dumper $r'
IMHO, kcott's grep (++) is the cleanest, and classic/canonical.
|
|
The disadvantage is that delete works well on hashes, but "badly" (and it is deprecated) on arrays. It does not really remove an entry but just undefs it. The exception is when it is the last element(s), which is why it worked in the original example; if you put e.g. two 'a' ids at the start of the array, you'll see it. See also the documentation of delete.
Greetings,
Janek Schleicher
PS: I agree that the grep solution is the usual way. I was just interested in writing an in-place algorithm, as sometimes that's useful, too, when working with big data.
|
|
perl -MData::Dumper -E 'my $r=[map { id=>$_ }, ("b","a".."c","b")];
say Dumper $r;
my %h;
$h{$r->[$_]{id}}++ and delete $r->[$_] for 0..$#$r;
say Dumper $r'
--- SECOND (Relevant) PART of OUTPUT---
$VAR1 = [
{
'id' => 'b'
},
{
'id' => 'a'
},
undef,
{
'id' => 'c'
}
];
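The hole can also be checked with exists. The second half below is one possible in-place alternative that really removes elements with splice; it is a sketch under my own assumptions (the %first index map and the backwards walk are not from the thread), and since each splice shifts the tail, it can be slow on very large arrays.

```perl
use strict;
use warnings;

my $r = [ map { +{ id => $_ } } qw(b a b c b) ];

# delete on an array element leaves a hole unless it is at the end.
my %h;
$h{ $r->[$_]{id} }++ and delete $r->[$_] for 0 .. $#$r;
my $state = exists $r->[2] ? 'exists' : 'hole';
print scalar(@$r), " $state\n";                  # 4 hole

# Sketch of an in-place alternative: record the index of the first
# occurrence of each id, then walk from the end so that splice never
# shifts an element that has not been visited yet.
my $s = [ map { +{ id => $_ } } qw(b a b c b) ];
my %first;
$first{ $s->[$_]{id} } //= $_ for 0 .. $#$s;
for my $i ( reverse 0 .. $#$s ) {
    splice @$s, $i, 1 if $first{ $s->[$i]{id} } != $i;
}
my $ids = join ',', map { $_->{id} } @$s;
print "$ids\n";                                  # b,a,c
```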
Re: removing duplicates from an array of hashes
by vinoth.ree (Monsignor) on Apr 17, 2014 at 07:26 UTC
use strict;
use warnings;
use Data::Dumper;
my $ref = [];
$ref->[0] = { id => 'a' };
$ref->[1] = { id => 'b' };
$ref->[2] = { id => 'c' };
$ref->[3] = { id => 'b' };
#But using temp hash
my $temp_hash={};
my @Uniqe_Array_Of_Hash = grep { $_ && ++$temp_hash->{$_->{id}} < 2 } @$ref;
print Dumper \@Uniqe_Array_Of_Hash;
Re: removing duplicates from an array of hashes
by hdb (Monsignor) on Apr 17, 2014 at 10:38 UTC
Using an anonymous hash, not maintaining the original order:
@$ref = values %{{ map { $_->{'id'} => $_ } @$ref }};
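One more difference from the grep versions is worth spelling out: with a hash keyed by id, later pairs overwrite earlier ones, so the last duplicate survives, not the first. A small sketch, where the extra field n is a hypothetical addition used only to tell the duplicates apart, and named hashes stand in for the anonymous one:

```perl
use strict;
use warnings;

# 'n' is a hypothetical extra field, only there to tell duplicates apart.
my $ref = [
    { id => 'a', n => 1 },
    { id => 'b', n => 2 },
    { id => 'c', n => 3 },
    { id => 'b', n => 4 },
];

# Later pairs overwrite earlier ones, so the last 'b' (n => 4) survives.
my %last_by_id = map { $_->{id} => $_ } @$ref;

# Reversing the input first makes the first occurrence win instead.
my %first_by_id = map { $_->{id} => $_ } reverse @$ref;

# values returns elements in hash order, so sort by id before printing.
my $last_str  = join ' ', map { "$_->{id}$_->{n}" }
                sort { $a->{id} cmp $b->{id} } values %last_by_id;
my $first_str = join ' ', map { "$_->{id}$_->{n}" }
                sort { $a->{id} cmp $b->{id} } values %first_by_id;

print "$last_str\n";    # a1 b4 c3
print "$first_str\n";   # a1 b2 c3
```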
|
|
You perlmonks are seriously amazing! Can't thank you enough!