rahulme81 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, below is the output of my hash. I get multiple values for my keys, and some values appear more than once. I need to keep only unique values for each key.

$HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn',
                     'Bessemer', 'Eufaula', 'Auburn', 'Bessemer' ],
    'California' => ['Barstow','Barstow'],
    'Georgia'    => ['Darien'],
    'New York'   => ['Amsterdam','Coney Island','Coney Island','Beacon','Becon']
};

For example, for the key "California" I get the values 'Barstow','Barstow' (the same value twice); I need to retain only one 'Barstow'.

What is a good way to check for duplicates in a key that has multiple values, and keep the values unique, so that my output looks like this:

$HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn',
                     'Bessemer'],
    'California' => ['Barstow'],
    'Georgia'    => ['Darien'],
    'New York'   => ['Amsterdam','Coney Island','Becon']
};

Thanks

Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by AnomalousMonk (Archbishop) on May 23, 2017 at 03:30 UTC

    One way:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use List::MoreUtils qw(uniq);
     ;;
     use Data::Dump qw(dd);
     ;;
     my $hashref = {
       'Alabama'    => [ qw(Andalusia Anniston Clanton Eufaula Auburn Bessemer Eufaula Auburn Bessemer) ],
       'California' => [ qw(Barstow Barstow) ],
       'Georgia'    => [ qw(Darien) ],
       'New York'   => [ 'Coney Island', qw(Amsterdam Beacon Becon), 'Coney Island', ],
     };
     ;;
     make_uniq($hashref);
     dd $hashref;
     ;;
     ;;
     sub make_uniq {
       my ($hr) = @_;
       ;;
       $_ = [ uniq @$_ ] for values %$hr;
       }
    "
    {
      Alabama => [
        "Andalusia",
        "Anniston",
        "Clanton",
        "Eufaula",
        "Auburn",
        "Bessemer",
      ],
      California => ["Barstow"],
      Georgia => ["Darien"],
      "New York" => ["Coney Island", "Amsterdam", "Beacon", "Becon"],
    }

    Update: See List::MoreUtils::uniq()


    Give a man a fish:  <%-{-{-{-<

      Thanks. How can I put this in my script? Is it like below?

      make_uniq(\%hashref);
      dd $hashref;

      sub make_uniq {
          my ($hr) = @_;
          $_ = [ uniq @$_ ] for values %$hr;
      }
        "How can i put this in my script ?"

        If your data is held in the form of a hash (not a hash reference), e.g.

        my %hash = (
          'Alabama'    => [ qw(Andalusia Anniston Clanton Eufaula Auburn Bessemer Eufaula Auburn Bessemer) ],
          'California' => [ qw(Barstow Barstow) ],
          'Georgia'    => [ qw(Darien) ],
          'New York'   => [ 'Coney Island', qw(Amsterdam Beacon Becon), 'Coney Island', ],
        );
        the following works (tested):
        make_uniq(\%hash);
        dd \%hash;
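
For reference, here is a minimal sketch of a complete script for the plain-hash case. It swaps in two things that are not in the replies above: uniq() from the core List::Util module (which needs List::Util 1.45 or later) instead of List::MoreUtils, and Data::Dumper instead of Data::Dump. The data is a shortened copy of the sample above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(uniq);    # core module; uniq() needs List::Util 1.45 or later
use Data::Dumper;

# Replace each array reference in the hash with a de-duplicated copy.
# uniq() keeps the first occurrence of each value, so order is preserved.
sub make_uniq {
    my ($hr) = @_;
    $_ = [ uniq @$_ ] for values %$hr;
    return $hr;
}

my %hash = (
    'Alabama'    => [ qw(Andalusia Anniston Clanton Eufaula Auburn Bessemer Eufaula Auburn Bessemer) ],
    'California' => [ qw(Barstow Barstow) ],
    'Georgia'    => [ qw(Darien) ],
);

make_uniq(\%hash);              # pass a reference to the hash, not the hash itself

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper(\%hash);
```

The key point is the call site: the sub expects a hash reference, so a plain hash is passed as \%hash and dumped the same way.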


        Give a man a fish:  <%-{-{-{-<

Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by kcott (Archbishop) on May 23, 2017 at 05:59 UTC

    G'day rahulme81,

    That's a FAQ: "How can I remove duplicate elements from a list or array?"

    To keep the values unique for a specific key, without removing duplicates from the values of other keys, you'll need to clear the %seen hash for each arrayref value. Here's how you might implement that:

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Data::Dump;

    my $HASH1 = {
        'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn',
                         'Bessemer', 'Eufaula', 'Auburn', 'Bessemer' ],
        'California' => ['Barstow','Barstow'],
        'Georgia'    => ['Darien'],
        'New York'   => ['Amsterdam','Coney Island','Coney Island','Beacon','Becon']
    };

    for (values %$HASH1) {
        my %seen;
        $_ = [ grep { ! $seen{$_}++ } @$_ ];
    }

    dd $HASH1;

    Output:

    {
      "Alabama"    => [
                        "Andalusia",
                        "Anniston",
                        "Clanton",
                        "Eufaula",
                        "Auburn",
                        "Bessemer",
                      ],
      "California" => ["Barstow"],
      "Georgia"    => ["Darien"],
      "New York"   => ["Amsterdam", "Coney Island", "Beacon", "Becon"],
    }

    Note that the input shown in your OP has the two unique values 'Beacon' and 'Becon', but your expected output only has 'Becon': I suspect a typo. Beyond that, my actual output matches your expected output.

    See also: values.

    — Ken

Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by Athanasius (Archbishop) on May 23, 2017 at 03:37 UTC

    Hello rahulme81,

    Another approach is to store each set of values in its own hash, rather than in an array:

    use strict;
    use warnings;
    use Data::Dump;

    my $HASH1 = {
        'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn',
                         'Bessemer', 'Eufaula', 'Auburn', 'Bessemer' ],
        'California' => ['Barstow', 'Barstow'],
        'Georgia'    => ['Darien'],
        'New York'   => ['Amsterdam', 'Coney Island', 'Coney Island', 'Beacon',
                         'Becon']
    };

    print "Values stored in anonymous arrays:\n";
    dd $HASH1;

    my $HASH2;

    for my $key (keys %$HASH1)
    {
        ++$HASH2->{$key}{$_} for @{ $HASH1->{$key} };
    }

    print "\nValues stored in anonymous hashes:\n";
    dd $HASH2;

    Output:

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,
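
As a follow-up to the hash-of-hashes idea above, here is a small sketch of how the unique values might be read back out of that layout. The helper name to_sets and the shortened sample data are assumptions made for illustration; only the counting loop mirrors the code in the reply.

```perl
use strict;
use warnings;

# Hypothetical helper: build the hash-of-hashes "set" layout from a hash of arrays.
# Each inner hash maps a value to the number of times it appeared.
sub to_sets {
    my ($hoa) = @_;
    my %sets;
    for my $key (keys %$hoa) {
        $sets{$key}{$_}++ for @{ $hoa->{$key} };
    }
    return \%sets;
}

my $sets = to_sets({
    'California' => [ 'Barstow', 'Barstow' ],
    'New York'   => [ 'Amsterdam', 'Coney Island', 'Coney Island', 'Beacon' ],
});

# Membership test: a hash lookup instead of a scan through an array.
print "Barstow is listed under California\n"
    if exists $sets->{'California'}{'Barstow'};

# The unique values for one key are the inner keys.
# The original order is not kept, so sort them for display.
my @cities = sort keys %{ $sets->{'New York'} };
print "@cities\n";

# The stored count records how often each value appeared in the input.
printf "Coney Island appeared %d times\n", $sets->{'New York'}{'Coney Island'};
```

The trade-off is that the inner hashes do not remember insertion order, so this layout suits lookups better than ordered output.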

Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by shmem (Chancellor) on May 23, 2017 at 06:16 UTC
    "I get multiple values for my keys and values appeared more than one"

    Each hash key only holds one value. In this case, the value is a reference to an anonymous array. So your question really is "How can I remove duplicate elements from a list or array?"

    This is a FAQ. You'd process each anonymous array and remove duplicates, or create a new anonymous array holding unique items and assign that to the value slot of each hash entry.

    If you don't want to reassign to the value slot - because, say, the reference is stored elsewhere too, and you don't want that link to be destroyed - you could use splice to edit the anonymous arrays in-place, like this:

    use strict;
    use warnings;
    use Data::Dump;

    my $HASH1 = {
        'Alabama' => [
            'Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn',
            'Bessemer', 'Eufaula', 'Auburn', 'Bessemer',
        ],
        'California' => ['Barstow','Barstow'],
        'Georgia' => ['Darien'],
        'New York' => [
            'Amsterdam','Coney Island','Coney Island',
            'Becon','Becon',
        ],
    };
    for my $key ( keys %$HASH1 ) {
        my $arrayref = $HASH1->{$key};
        my %seen; # empty at each iteration
        $seen{$_}++ for @$arrayref;
        # process array from the end towards beginning
        for my $index ( reverse 0 .. $#$arrayref ) {
            if ( $seen{$arrayref->[$index]} > 1 ) {
                $seen{$arrayref->[$index]}--;
                my $removed = splice @$arrayref, $index, 1;
                print "key $key: removed '$removed'\n";
            }
        }
    }
    dd $HASH1;
    __END__
    key Alabama: removed 'Bessemer'
    key Alabama: removed 'Auburn'
    key Alabama: removed 'Eufaula'
    key New York: removed 'Becon'
    key New York: removed 'Coney Island'
    key California: removed 'Barstow'
    {
      "Alabama"    => [
                        "Andalusia",
                        "Anniston",
                        "Clanton",
                        "Eufaula",
                        "Auburn",
                        "Bessemer",
                      ],
      "California" => ["Barstow"],
      "Georgia"    => ["Darien"],
      "New York"   => ["Amsterdam", "Coney Island", "Becon"],
    }

    Note that in the OP, in the anonymous array for the key New York you have the entries Becon and Beacon which are different - a typo, I guess.
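
If the goal is only to keep the array reference intact, a shorter alternative to splice might be to assign the filtered list back to the dereferenced array. This is a sketch of a different technique from the one in the reply above; the sub name uniq_in_place and the $alias variable are made up for illustration.

```perl
use strict;
use warnings;

# Remove duplicates from each array in place, keeping first occurrences.
# Assigning to @$aref changes the contents of the existing array, so any
# other reference to that array sees the change as well.
sub uniq_in_place {
    my ($hr) = @_;
    for my $aref (values %$hr) {
        my %seen;    # fresh for every array
        @$aref = grep { !$seen{$_}++ } @$aref;
    }
    return;
}

my $HASH1 = { 'California' => [ 'Barstow', 'Barstow' ] };
my $alias = $HASH1->{'California'};    # second reference to the same array

uniq_in_place($HASH1);

print "alias now holds: @$alias\n";
print "still the same array\n" if $alias == $HASH1->{'California'};
```

Unlike the splice version, this does not report which elements were removed; it only guarantees that the reference stored in the hash is unchanged.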

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
      Thank you so much everyone for guidance and suggestions.
Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by BillKSmith (Monsignor) on May 23, 2017 at 13:33 UTC
    If you have the choice, it is easier to avoid storing the duplicates than it is to remove them later.
    use strict;
    use warnings;
    use Data::Dumper;
    my $HASH1 = {
        'California' => [ 'Barstow' ],
    };

    my $new_value = 'Barstow';
    # Compare with eq: a regex match would also accept substrings.
    if (!grep { $_ eq $new_value } map( @{$_}, values %$HASH1)){
        push @{$HASH1->{California}}, $new_value;
    }

    $new_value = 'other';
    if (!grep { $_ eq $new_value } map( @{$_}, values %$HASH1)){
        push @{$HASH1->{California}}, $new_value;
    }

    print Dumper($HASH1);
    Note that map allows grep to search the entire hash, as required by your later question.
    Bill
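
When many values are added, scanning every array on each insert gets slow. One possible variation on the idea above is to keep a separate lookup hash of everything stored so far. This is only a sketch: the names add_city, %cities_by_state and %seen_city are invented for the example.

```perl
use strict;
use warnings;

my %cities_by_state;    # state => [ cities ]
my %seen_city;          # city => count, an index of every city stored so far

# Hypothetical helper: store $city under $state unless the city is already
# stored under any state. Returns 1 if the city was added, 0 otherwise.
sub add_city {
    my ($state, $city) = @_;
    return 0 if $seen_city{$city}++;
    push @{ $cities_by_state{$state} }, $city;
    return 1;
}

add_city('California', 'Barstow');
add_city('California', 'Barstow');    # ignored: exact duplicate
add_city('Alabama',    'Eufaula');
add_city('California', 'Eufaula');    # ignored: already stored under Alabama

print "$_: @{ $cities_by_state{$_} }\n" for sort keys %cities_by_state;
```

The lookup hash makes each check a single hash access, at the cost of keeping the two structures in sync whenever a city is removed.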
Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by rahulme81 (Sexton) on May 23, 2017 at 08:36 UTC

    One more thing: let's say I get keys/values as below in my hash,

    where a certain value appears under two different keys. For example, 'Eufaula' is a value for both 'Alabama' and 'California'.

    Likewise, 'Beacon' appears under both 'New York' and 'Utah'.

    Can I delete the value from one key only? (It doesn't matter which key retains the value; I just need it once, under any key.)

    my $HASH1 = {
        'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn'],
        'California' => ['Barstow','Eufaula'],
        'New York'   => ['Amsterdam','Coney Island','Beacon'],
        'Utah'       => ['Beacon','Layton']
    };

    So my result hash should look like:

    my $HASH1 = {
        'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn'],
        'California' => ['Barstow'],
        'New York'   => ['Amsterdam','Coney Island','Beacon'],
        'Utah'       => ['Layton']
    };
    Thanks.

      You could solve both problems in one go by temporarily inverting your hash. So instead of a (state => cities) hash, you have a (city => state) hash. Then you can invert it back.

      use strict;
      use warnings;
      use Data::Dumper;

      my %hash = (
          'Alabama'    => ['Andalusia','Anniston','Clanton','Eufaula','Auburn','Auburn','Auburn','Auburn'],
          'California' => ['Barstow','Eufaula'],
          'New York'   => ['Amsterdam','Coney Island','Beacon','Beacon','Beacon','Beacon','Beacon'],
          'Utah'       => ['Beacon','Layton'],
      );

      # Invert the hash.
      my %inverted = map {
          my $state = $_;
          map { $_ => $state } @{$hash{$state}}
      } sort keys %hash;
      print Dumper \%inverted;

      # And back again:
      %hash = ();
      while (my ($city, $state) = each %inverted) {
          push @{ $hash{$state} ||= [] }, $city;
      }
      print Dumper \%hash;
      "Can i delete this for one key only"

      Of course. Simply amend your algorithm to maintain a separate list (which could be a hash for easy/fast lookups) of deleted values. Never delete a value which is already in that list.

      "Can i delete this for one key only (no particular where i need to retain this value for which key, just need once in any key)"

      I made reference to that in my original reply:

      "To keep the values unique for a specific key, without removing duplicates from the values of other keys, you'll need to clear the %seen hash for each arrayref value."

      In this code:

      for (values %$HASH1) {
          my %seen;
          $_ = [ grep { ! $seen{$_}++ } @$_ ];
      }

      Just move the "my %seen;" line out of the loop:

      my %seen;
      for (values %$HASH1) {
          $_ = [ grep { ! $seen{$_}++ } @$_ ];
      }

      Which, as the loop now only contains one statement, you can write more succinctly as:

      my %seen;
      $_ = [ grep { ! $seen{$_}++ } @$_ ] for values %$HASH1;

      I put that into the original code I gave you (along with your new HoA). Here's the output:

      {
        "Alabama"    => ["Andalusia", "Anniston", "Clanton", "Eufaula", "Auburn"],
        "California" => ["Barstow"],
        "New York"   => ["Amsterdam", "Coney Island", "Beacon"],
        "Utah"       => ["Layton"],
      }

      Which exactly matches what you have for "so my result hash should look like".
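
One caveat worth sketching: values %$HASH1 returns the arrays in no guaranteed order, so which key keeps a shared value can differ between runs. If a predictable result matters, one option is to walk the keys in sorted order. The sub name uniq_across_keys below is made up for this example, and the sorted-key policy is an assumption, not something the original question asked for.

```perl
use strict;
use warnings;

# Remove duplicates across all keys, visiting keys in sorted order so that
# the alphabetically first key keeps a shared value on every run.
sub uniq_across_keys {
    my ($hr) = @_;
    my %seen;    # shared by all keys, so a value survives only once overall
    for my $key (sort keys %$hr) {
        $hr->{$key} = [ grep { !$seen{$_}++ } @{ $hr->{$key} } ];
    }
    return $hr;
}

my $HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn'],
    'California' => ['Barstow', 'Eufaula'],
    'New York'   => ['Amsterdam', 'Coney Island', 'Beacon'],
    'Utah'       => ['Beacon', 'Layton'],
};

uniq_across_keys($HASH1);

print "$_ => @{ $HASH1->{$_} }\n" for sort keys %$HASH1;
```

With this data, 'Eufaula' stays under 'Alabama' and 'Beacon' stays under 'New York', because those keys sort before 'California' and 'Utah'.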

      By the way, purely for academic interest, because I can't see that it gains you anything in this particular instance, if you're using Perl 5.10 or later (and have enabled the feature with use feature 'state'; or use 5.010;), you could have reduced those two lines to this one:

      $_ = [ grep { state %seen; ! $seen{$_}++ } @$_ ] for values %$HASH1;

      I tested that: the output is identical. See state for more details.

      — Ken