Hash Multiple values for a key-Filtering unique values for a key in hash

rahulme81 has asked for the wisdom of the Perl Monks concerning the following question:


Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by AnomalousMonk (Archbishop) on May 23, 2017 at 03:30 UTC

One way:

c:\@Work\Perl\monks>perl -wMstrict -le
"use List::MoreUtils qw(uniq);
 ;;
 use Data::Dump qw(dd);
 ;;
 my $hashref = {
   'Alabama' => [
     qw(Andalusia Anniston Clanton Eufaula Auburn Bessemer
       Eufaula Auburn Bessemer)
     ],
   'California' => [ qw(Barstow Barstow) ],
   'Georgia'    => [ qw(Darien) ],
   'New York'   => [
     'Coney Island', qw(Amsterdam Beacon Becon), 'Coney Island',
     ],
   };
 ;;
 make_uniq($hashref);
 dd $hashref;
 ;;
 ;;
 sub make_uniq {
   my ($hr) = @_;
   ;;
   $_ = [ uniq @$_ ] for values %$hr;
   }
"
{
  Alabama    => [
                  "Andalusia",
                  "Anniston",
                  "Clanton",
                  "Eufaula",
                  "Auburn",
                  "Bessemer",
                ],
  California => ["Barstow"],
  Georgia    => ["Darien"],
  "New York" => ["Coney Island", "Amsterdam", "Beacon", "Becon"],
}

Update: See List::MoreUtils::uniq()
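
If installing List::MoreUtils is not an option, newer releases of the core List::Util module (1.45 and later) also export a uniq function. The following is only a sketch under that assumption; make_uniq_core and $cities are hypothetical names used for illustration, not part of the reply above.

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);   # assumption: List::Util 1.45 or newer is installed

# Hypothetical helper mirroring make_uniq() above, but using the core module.
sub make_uniq_core {
    my ($hr) = @_;
    # Each $_ is aliased to a hash value, so assignment replaces the arrayref.
    $_ = [ uniq @$_ ] for values %$hr;
    return $hr;
}

my $cities = {
    'California' => [ qw(Barstow Barstow) ],
    'New York'   => [ 'Coney Island', qw(Amsterdam Beacon Becon), 'Coney Island' ],
};
make_uniq_core($cities);
print join(', ', @{ $cities->{'New York'} }), "\n";
```

Like the List::MoreUtils version, this keeps the first occurrence of each value and preserves the original order.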

Give a man a fish: <%-{-{-{-<


Re^2: Hash Multiple values for a key-Filtering unique values for a key in hash

by rahulme81 (Sexton) on May 23, 2017 at 03:49 UTC

Thanks. How can I put this in my script? Is it like below?

make_uniq(\%hashref);
dd $hashref;

sub make_uniq {
my ($hr) = @_;
$_ = [ uniq @$_ ] for values %$hr;
}

Re^3: Hash Multiple values for a key-Filtering unique values for a key in hash

by AnomalousMonk (Archbishop) on May 23, 2017 at 04:06 UTC

How can I put this in my script?

If your data is held in the form of a hash (not a hash reference), e.g.

my %hash = (
  'Alabama' => [
    qw(Andalusia Anniston Clanton Eufaula Auburn Bessemer
      Eufaula Auburn Bessemer)
    ],
  'California' => [ qw(Barstow Barstow) ],
  'Georgia'    => [ qw(Darien) ],
  'New York'   => [
    'Coney Island', qw(Amsterdam Beacon Becon), 'Coney Island',
    ],
  );
then pass a reference to that hash to the function:

make_uniq(\%hash);
dd \%hash;
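
To make the difference between the two calling styles concrete, here is a minimal, self-contained sketch. It uses a core-only stand-in for make_uniq() (an assumption, so that the example runs without List::MoreUtils); the variable names are illustrative.

```perl
use strict;
use warnings;

# Core-only stand-in for the make_uniq() shown earlier (assumption: any
# routine that takes a hash reference behaves the same way here).
sub make_uniq {
    my ($hr) = @_;
    for my $list (values %$hr) {
        my %seen;
        $list = [ grep { !$seen{$_}++ } @$list ];
    }
    return;
}

my %hash    = ( California => [ qw(Barstow Barstow) ] );   # a named hash
my $hashref = { Georgia    => [ qw(Darien Darien) ] };     # already a reference

make_uniq(\%hash);      # take a reference with the backslash
make_uniq($hashref);    # pass the reference as it is

print scalar @{ $hash{California} }, ' ', scalar @{ $hashref->{Georgia} }, "\n";
```

Writing \%hashref when only the scalar $hashref exists refers to a different, undeclared variable, which "use strict" reports at compile time.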

Give a man a fish: <%-{-{-{-<


Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by kcott (Archbishop) on May 23, 2017 at 05:59 UTC

G'day rahulme81,

That's a FAQ: see "How can I remove duplicate elements from a list or array?" in perlfaq4.

To keep the values unique for a specific key, without removing duplicates from the values of other keys, you'll need to clear the %seen hash for each arrayref value. Here's how you might implement that:

#!/usr/bin/env perl

use strict;
use warnings;

use Data::Dump;

my $HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn',
                     'Bessemer', 'Eufaula', 'Auburn', 'Bessemer'],
    'California' => ['Barstow', 'Barstow'],
    'Georgia'    => ['Darien'],
    'New York'   => ['Amsterdam', 'Coney Island', 'Coney Island', 'Beacon', 'Becon'],
};

for (values %$HASH1) {
    my %seen;
    $_ = [ grep { ! $seen{$_}++ } @$_ ];
}

dd $HASH1;

Output:

{
  "Alabama"    => [
                    "Andalusia",
                    "Anniston",
                    "Clanton",
                    "Eufaula",
                    "Auburn",
                    "Bessemer",
                  ],
  "California" => ["Barstow"],
  "Georgia"    => ["Darien"],
  "New York"   => ["Amsterdam", "Coney Island", "Beacon", "Becon"],
}

Note that the input shown in your OP has the two unique values 'Beacon' and 'Becon', but your expected output only has 'Becon': I suspect a typo. Beyond that, my actual output matches your expected output.


Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by shmem (Chancellor) on May 23, 2017 at 06:16 UTC

I get multiple values for my keys, and values appear more than once

Each hash key only holds one value. In this case, the value is a reference to an anonymous array. So your question really is "How can I remove duplicate elements from a list or array?"

This is a FAQ. You'd process each anonymous array and remove duplicates, or create a new anonymous array holding unique items and assign that to the value slot of each hash entry.

If you don't want to reassign to the value slot - because, say, the reference is stored elsewhere too, and you don't want that link to be destroyed - you could use splice to edit the anonymous arrays in-place, like this:

use Data::Dump;    # provides dd()

my $HASH1 = {
    'Alabama' => [
        'Andalusia', 'Anniston', 'Clanton', 'Eufaula',
       'Auburn', 'Bessemer', 'Eufaula', 'Auburn', 'Bessemer',
    ],
    'California' => ['Barstow','Barstow'],
    'Georgia' => ['Darien'],
    'New York' => [
        'Amsterdam','Coney Island','Coney Island',
        'Becon','Becon',
    ],
};
for my $key ( keys %$HASH1 ) {
    my $arrayref = $HASH1->{$key};

    my %seen; # empty at each iteration
    $seen{$_}++ for @$arrayref;

    # process array from the end towards beginning

    for my $index ( reverse 0 .. $#$arrayref ) {
        if ( $seen{$arrayref->[$index]} > 1 ) {
            $seen{$arrayref->[$index]}--; 
            my $removed = splice @$arrayref, $index, 1;
            print "key $key: removed '$removed'\n";
        }
    }
}
dd $HASH1;
__END__
key Alabama: removed 'Bessemer'
key Alabama: removed 'Auburn'
key Alabama: removed 'Eufaula'
key New York: removed 'Becon'
key New York: removed 'Coney Island'
key California: removed 'Barstow'
{
  "Alabama"    => [
                    "Andalusia",
                    "Anniston",
                    "Clanton",
                    "Eufaula",
                    "Auburn",
                    "Bessemer",
                  ],
  "California" => ["Barstow"],
  "Georgia"    => ["Darien"],
  "New York"   => ["Amsterdam", "Coney Island", "Becon"],
}

Note that in the OP, in the anonymous array for the key New York you have the entries Becon and Beacon which are different - a typo, I guess.
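
A shorter way to keep the existing array (and therefore any other references to it) intact is to assign the filtered list back to the dereferenced array rather than to the hash slot. This is a sketch of that alternative, not the method above; $shared is a hypothetical second reference added to show that the link survives.

```perl
use strict;
use warnings;

my $shared = [ qw(Barstow Barstow Eufaula) ];
my $HASH1  = { California => $shared };   # the same array is referenced twice

for my $arrayref (values %$HASH1) {
    my %seen;
    # Assigning to @$arrayref rewrites the existing array's contents,
    # so every other reference to it sees the de-duplicated list too.
    @$arrayref = grep { !$seen{$_}++ } @$arrayref;
}

print "same array\n" if $HASH1->{California} == $shared;
print "@$shared\n";
```

Unlike the splice loop, this version does not report which elements were removed; it only keeps the first occurrence of each value.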

perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'


Re^2: Hash Multiple values for a key-Filtering unique values for a key in hash

by rahulme81 (Sexton) on May 23, 2017 at 08:21 UTC

Thank you so much everyone for guidance and suggestions.


Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by BillKSmith (Monsignor) on May 23, 2017 at 13:33 UTC

use strict;
use warnings;
use Data::Dumper;
my $HASH1 = {
    'California' => [ 'Barstow' ],
};

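my $new_value = 'Barstow';
# Push only if no key already lists exactly this city. String equality is
# used rather than a regex, so 'Bar' is not mistaken for 'Barstow'.
if (!grep { $_ eq $new_value } map( @{$_}, values %$HASH1)){
   push @{$HASH1->{California}}, $new_value;
}

$new_value = 'other';
if (!grep { $_ eq $new_value } map( @{$_}, values %$HASH1)){
   push @{$HASH1->{California}}, $new_value;
}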

print Dumper($HASH1);

Bill


Re: Hash Multiple values for a key-Filtering unique values for a key in hash
by rahulme81 (Sexton) on May 23, 2017 at 08:36 UTC

One more thing: suppose I get keys/values like the ones below in my hash, where a certain value appears under two different keys. For example, 'Eufaula' is a value for both 'Alabama' and 'California'.

Likewise, 'Beacon' appears for both 'New York' and 'Utah'.

Can I delete such a value from one key only? (It doesn't matter which key retains it; I just need it once, under any key.)

my $HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn'],
    'California' => ['Barstow', 'Eufaula'],
    'New York'   => ['Amsterdam', 'Coney Island', 'Beacon'],
    'Utah'       => ['Beacon', 'Layton'],
};

so my result hash should look like

my $HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn'],
    'California' => ['Barstow'],
    'New York'   => ['Amsterdam', 'Coney Island', 'Beacon'],
    'Utah'       => ['Layton'],
};
Thanks.


Re^2: Hash Multiple values for a key-Filtering unique values for a key in hash

by tobyink (Canon) on May 23, 2017 at 10:36 UTC

You could solve both problems in one go by temporarily inverting your hash. So instead of a (state => cities) hash, you have a (city => state) hash. Then you can invert it back.

use strict;
use warnings;
use Data::Dumper;

my %hash = (
  'Alabama'    => ['Andalusia','Anniston','Clanton','Eufaula','Auburn','Auburn','Auburn','Auburn'],
  'California' => ['Barstow','Eufaula'],
  'New York'   => ['Amsterdam','Coney Island','Beacon','Beacon','Beacon','Beacon','Beacon'],
  'Utah'       => ['Beacon','Layton'],
);

# Invert the hash.
my %inverted = map {
    my $state = $_;
    map { $_ => $state } @{$hash{$state}}
} sort keys %hash;

print Dumper \%inverted;

# And back again:
%hash = ();
while (my ($city, $state) = each %inverted) {
    push @{ $hash{$state} ||= [] }, $city;
}

print Dumper \%hash;


Re^2: Hash Multiple values for a key-Filtering unique values for a key in hash

by hippo (Archbishop) on May 23, 2017 at 08:55 UTC

Can I delete this for one key only

Of course. Simply amend your algorithm to maintain a separate list (which could be a hash for easy/fast lookups) of deleted values. Never delete a value which is already in that list.
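
One possible shape for that bookkeeping is sketched below. It is a variation on the idea: it tracks the values already kept (rather than those already deleted) in a lookup hash and returns a record of what was dropped. The helper name dedupe_across_keys and the choice to visit keys in sorted order are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep each value under the first key (in sorted key
# order) that holds it, and return a record of what was dropped.
sub dedupe_across_keys {
    my ($hr) = @_;
    my (%seen, @dropped);
    for my $key (sort keys %$hr) {
        my @kept;
        for my $value (@{ $hr->{$key} }) {
            if ($seen{$value}++) { push @dropped, "$key: $value" }
            else                 { push @kept, $value }
        }
        $hr->{$key} = \@kept;
    }
    return @dropped;
}

my $HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn'],
    'California' => ['Barstow', 'Eufaula'],
    'New York'   => ['Amsterdam', 'Coney Island', 'Beacon'],
    'Utah'       => ['Beacon', 'Layton'],
};
my @dropped = dedupe_across_keys($HASH1);
print "$_\n" for @dropped;
```

Sorting the keys matters only for repeatability: Perl does not guarantee hash ordering, so without it the key that retains a shared value could differ between runs.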


Re^2: Hash Multiple values for a key-Filtering unique values for a key in hash

by kcott (Archbishop) on May 24, 2017 at 05:44 UTC

"Can I delete this for one key only (no particular where I need to retain this value for which key, just need once in any key)"

I made reference to that in my original reply:

"To keep the values unique for a specific key, without removing duplicates from the values of other keys, you'll need to clear the %seen hash for each arrayref value."

In this code:

for (values %$HASH1) {
    my %seen;
    $_ = [ grep { ! $seen{$_}++ } @$_ ];
}

Just move the "my %seen;" line out of the loop:

my %seen;
for (values %$HASH1) {
    $_ = [ grep { ! $seen{$_}++ } @$_ ];
}

Which, as the loop now only contains one statement, you can write more succinctly as:

my %seen;
$_ = [ grep { ! $seen{$_}++ } @$_ ] for values %$HASH1;

I put that into the original code I gave you (along with your new HoA). Here's the output:

{
  "Alabama"    => ["Andalusia", "Anniston", "Clanton", "Eufaula", "Auburn"],
  "California" => ["Barstow"],
  "New York"   => ["Amsterdam", "Coney Island", "Beacon"],
  "Utah"       => ["Layton"],
}

Which exactly matches what you have for "so my result hash should look like".

By the way, purely for academic interest (I can't see that it gains you anything in this particular instance): if you're using Perl 5.10 or later, and have enabled the feature with "use feature 'state';" or "use 5.010;", you could have reduced those two lines to this one:

$_ = [ grep { state %seen; ! $seen{$_}++ } @$_ ] for values %$HASH1;

I tested that: the output is identical. See state for more details.
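
For completeness, here is a minimal standalone sketch of the "state" variant with the pragma it needs. The explicit loop over sorted keys is an addition for repeatable results (hash order is not guaranteed) and is not part of the one-liner above; the two-key data set is a shortened, illustrative version of the HoA.

```perl
use strict;
use warnings;
use feature 'state';   # 'state' is not enabled by default

my $HASH1 = {
    'Alabama'    => ['Andalusia', 'Anniston', 'Clanton', 'Eufaula', 'Auburn'],
    'California' => ['Barstow', 'Eufaula'],
};

# Visiting keys in sorted order makes it predictable which key keeps a
# shared value: 'Alabama' is processed before 'California'.
for my $key (sort keys %$HASH1) {
    # %seen is created once and is not reset between loop iterations.
    $HASH1->{$key} = [ grep { state %seen; !$seen{$_}++ } @{ $HASH1->{$key} } ];
}

print "$_: @{ $HASH1->{$_} }\n" for sort keys %$HASH1;
```

Because a state variable is never reinitialized, placing this loop in a named sub and calling it for a second hash would treat values from the first hash as already seen; a plain "my %seen;" declared before the loop avoids that.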

— Ken
