in reply to delete duplicate hash value's

If you're not choosy about which duplicates get deleted (or rather, which remain), you could try something like this.

Effectively a variation on the standard idiom used for weeding duplicates from an array, it creates another hash from the values of the second-level hashes and uses that to determine whether a duplicate has been seen yet.

my %seen;
for my $key (keys %students) {
    my $value_key = "@{[values %{$students{$key}}]}";
    if (exists $seen{$value_key}) {
        delete $students{$key};
    }
    else {
        $seen{$value_key}++;
    }
}
undef %seen;
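
For concreteness, here is a minimal, self-contained sketch of the same idiom run against a made-up %students structure (the student and field names are invented purely for illustration):

#! perl -w
use strict;

# Hypothetical sample data: 'alice' and 'carol' hold identical values.
my %students = (
    alice => { name => 'Alice', grade => 'A' },
    bob   => { name => 'Bob',   grade => 'B' },
    carol => { name => 'Alice', grade => 'A' },
);

my %seen;
for my $key (keys %students) {
    my $value_key = "@{[values %{$students{$key}}]}";
    if (exists $seen{$value_key}) {
        delete $students{$key};
    }
    else {
        $seen{$value_key}++;
    }
}

# One of 'alice'/'carol' has been deleted; which one depends on
# the (unordered) traversal of %students.
print join(' ', sort keys %students), "\n";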

Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
Pick up your cloud down the end, and "Yes", if you get allocated a grey one they are a bit damp underfoot, but someone has to get them.
Get used to the wings fast 'cos it's an 8-hour day...unless the Governor calls for a cyclone or hurricane, in which case 16-hour shifts are mandatory.
Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

Re^2: delete duplicate hash value's
by Anonymous Monk on Jun 04, 2009 at 22:26 UTC
    This solution, in creating its $value_key, assumes that the values of the hash (the values function) will be returned in the same order with respect to the key names for each of the different student records. I don't think that Perl guarantees that behavior.
      I don't think that Perl guarantees that behavior.

      In general, hashes are iterated in bucket order (and Perl does guarantee that values are returned in the same order as keys). Bucket order is a function of the hashing algorithm used, which is fixed in Perl. Even with the hash randomisation fix for the "algorithm complexity attack" on Perl's hashes--which changes the initialisation values used by the hashing algorithm--the ordering is guaranteed to remain the same for any given run of the program, which is all that is required of the code above.

      Essentially, if two hashes contain the same keys, keys (and therefore values) will return them in the same order, regardless of the order in which they were inserted. This can be demonstrated to be so:

      #! perl -sw
      use 5.010;
      use strict;
      use List::Util qw[ shuffle ];

      our $I ||= 1e6;

      # Build a hash with the same four keys, inserted in a random order each call.
      sub genHash {
          my %hash;
          @hash{ shuffle 'a'..'d' } = 1 .. 4;
          return \%hash;
      }

      my $datum = join ' ', keys %{ genHash() };
      warn $datum . "\n";

      for my $i ( 1 .. $I ) {
          my $test = join ' ', keys %{ genHash() };
          die "test failed after $i iters: $datum vs. $test\n"
              unless $datum eq $test;
      }

      say "Test passed for $I iterations";

      __END__
      C:\test>junk2
      c a b d
      Test passed for 1000000 iterations

      However, there is a caveat to this that obviously did not occur to me back in the day. Whilst the iteration order is independent of the insertion order, it is dependent upon the number of buckets in the hash.

      That is, if the hashes being compared contain the same keys--and have never contained any other keys--their iteration orders will be the same. But if the hashes have different numbers of buckets--if, for example, one of them has previously contained more keys, some of which have subsequently been deleted--then their iteration orderings will differ:

      @hashA{ 'a'..'d' } = 1..4;;
      @hashB{ 'a'..'j' } = 1 .. 10;;
      delete $hashB{ $_ } for 'e' .. 'j';;

      print scalar %hashA;;
      4/8
      print scalar %hashB;;
      4/16

      print join ' ', keys %hashA;;
      c a b d
      print join ' ', keys %hashB;;
      a d c b

      So, whilst this is unlikely to have affected the OP's application, for general application it would be better to sort the values by key order, and to use a join delimiter that is not going to occur in the values being concatenated:

      my $value_key = join $;, @{ $students{ $key } }{ sort keys %{ $students{ $key } } };
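
      To make that concrete, here is a sketch of the dedup loop from the top of this node reworked to use that order-independent key (assuming the same two-level %students structure as above):

      my %seen;
      for my $key ( keys %students ) {
          # Build the key from the inner hash's values, sorted by field name,
          # joined with $; so embedded spaces can't cause false matches.
          my $value_key = join $;, @{ $students{ $key } }{ sort keys %{ $students{ $key } } };
          if ( exists $seen{$value_key} ) {
              delete $students{$key};
          }
          else {
              $seen{$value_key}++;
          }
      }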

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.