Kandankarunai has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, how do I delete duplicate values and sub-array values in a hash of arrays? For example, consider the following hash of arrays:
%input=(r=>[1..10],s=>[1..10],t=>[5..10],u=>[11..20]);
My output hash should look like this:
%input=(r=>[1..10],u=>[11..20])
How do I do this?

Replies are listed 'Best First'.
Re: Remove the duplication in Hash values
by Corion (Patriarch) on May 20, 2011 at 07:05 UTC

    Why do you want to keep r and not s? Why do you want to throw away t?

      Because t is a sub-array of s and r.

        So what should be kept for this input?

        %input=( r=>[1,2], s=>[2,3], t=>[1,3] );

        Also, please show us the code you have already written, and the input data, and your output, and please also explain how the output is not what you want.

        Update: Also, please do not repost your questions if you've already got answers. Please include a reference to your previous questions and also explain what problems you have with the previous answers. I consider it very rude to not mention that you already asked your question before.

        Eliminate Array of array duplication

        Sub array in Array of array
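
Corion's three-key input can be checked mechanically. The sketch below assumes "sub array" means "every element also occurs in the other array"; the helper name is_subset is hypothetical and not taken from the thread. Under that reading, none of r, s, or t is contained in another, so the rule given so far would not remove any of them.

```perl
use strict;
use warnings;

# Corion's counterexample: every pair of arrays overlaps,
# but no array is fully contained in another.
my %input = ( r => [1,2], s => [2,3], t => [1,3] );

# is_subset($small, $big): true if every element of @$small occurs in @$big.
# (Hypothetical helper name, used only for illustration.)
sub is_subset {
    my ($small, $big) = @_;
    my %in_big = map { $_ => 1 } @$big;
    return !grep { !$in_big{$_} } @$small;
}

# Collect every (inner, outer) pair where inner is contained in outer.
my @contained;
for my $inner (sort keys %input) {
    for my $outer (sort keys %input) {
        next if $inner eq $outer;
        push @contained, "$inner in $outer"
            if is_subset($input{$inner}, $input{$outer});
    }
}

print @contained
    ? join("\n", @contained) . "\n"
    : "no array is contained in another\n";
```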

Re: Remove the duplication in Hash values
by baxy77bax (Deacon) on May 20, 2011 at 07:16 UTC
    Hi, well a clumsy way to do this would go something like this :
    use Data::Dumper;
    %input = (r=>[1..10], s=>[1..10], t=>[5..10], u=>[11..20]);
    # Key each array by "first_last" and keep only the first key seen for that pair
    for (keys %input) {
        $hash{"$input{$_}[0]_$input{$_}[-1]"}++;
        $tmp{$_} = $input{$_} if $hash{"$input{$_}[0]_$input{$_}[-1]"} == 1;
    }
    print Dumper(\%tmp);
    But if you play around with hashes you'll probably end up with some more clever solution :)

    cheers

    baxy

    Update: Oops, I just saw some extra conditions Corion pointed out, so I don't think this will work for you, but this should :)

    use Data::Dumper;
    %input = (r=>[1..10], s=>[1..10], t=>[5..10], u=>[11..20]);
    # Count each (first, last) pair, each first element, and each last element
    for (keys %input) {
        $hash{$input{$_}[0]}->{$input{$_}[-1]}++;
        $hash_1{$input{$_}[0]}++;
        $hash_2{$input{$_}[-1]}++;
        $tmp{$_} = $input{$_}
            if ( $hash{$input{$_}[0]}->{$input{$_}[-1]} == 1
              && $hash_1{$input{$_}[0]} == 1
              && $hash_2{$input{$_}[-1]} == 1 );
    }
    print Dumper(\%tmp);

    Output:

    $VAR1 = {
        'u' => [ 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 ],
        'r' => [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
    };
    (I haven't tested it yet, so there may be bugs for you to remove :))

    cheers

      Thank you for your valuable reply. What about t?
Re: Remove the duplication in Hash values
by Utilitarian (Vicar) on May 20, 2011 at 07:16 UTC
    Building on yesterday's question, you can use the same approach of building a unique identifier for each array.
    my %names = (one => [1,2,3], two => [4,5,6], deux => [4,5,6], three => [7,8,9]);
    my %values;
    for (keys %names) {
        # The joined element list serves as the unique identifier for an array
        delete $names{$_} if $values{ join(',', @{$names{$_}}) }++;
    }
    for (keys %names) {
        print "$_ => ", join(", ", @{$names{$_}}), "\n";
    }
    print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."
Re: Remove the duplication in Hash values
by wind (Priest) on May 20, 2011 at 17:23 UTC

    Building off my solution to an earlier problem: Re: Sub array in Array of array

    use strict;
    use warnings;

    my %input = (
        r => [1..10],
        s => [1..10],
        t => [5..10],
        u => [11..20],
    );

    for my $outer (sort keys %input) {
        next if ! $input{$outer};    # already deleted
        my %seen = map {$_ => 1} @{$input{$outer}};
        for my $inner (grep {$_ ne $outer} sort keys %input) {
            # Delete $inner if every one of its elements occurs in $outer
            if (! grep {!$seen{$_}} @{$input{$inner}}) {
                delete $input{$inner};
            }
        }
    }

    print "$_\n" for sort keys %input;
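
Which key survives depends on the order in which the keys are visited, and the OP never states a rule for that. As a minimal sketch, the variant below assumes one possible rule: longer arrays win, and ties go to the alphabetically first key. That rule and the names @order and %kept are assumptions for illustration, not something specified in the thread.

```perl
use strict;
use warnings;

my %input = (
    r => [1..10],
    s => [1..10],
    t => [5..10],
    u => [11..20],
);

# Assumed rule (not stated by the OP): visit longer arrays first,
# breaking ties by key name, so the survivor is deterministic.
my @order = sort {
    @{ $input{$b} } <=> @{ $input{$a} } or $a cmp $b
} keys %input;

# %kept maps each surviving key to a lookup hash of its elements.
my %kept;
KEY: for my $key (@order) {
    for my $keeper (values %kept) {
        # Skip $key if all of its elements already occur in a kept array.
        next KEY unless grep { !$keeper->{$_} } @{ $input{$key} };
    }
    $kept{$key} = { map { $_ => 1 } @{ $input{$key} } };
}

# Remove every key that was not kept.
delete @input{ grep { !$kept{$_} } keys %input };

print "$_ => [@{ $input{$_} }]\n" for sort keys %input;
```

With the sample data this keeps r and u: s is an exact duplicate of r and loses the tie-break on key name, and t is dropped because all of its elements occur in r.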
Re: Remove the duplication in Hash values
by anonymized user 468275 (Curate) on May 20, 2011 at 08:18 UTC
    If I understand correctly, you can't! Or at least, the array locations 0 and 0 thru 10 in the respective arrays in the output hash will have been autovivified giving existing addresses (test with exists()) populated however with undef(), although those above the upper limit you defined will not have come into existence.

    Update: You probably want just arrays (ie reconsider the storage model) of [0..9] and to iterate them afresh for each key iteration of the containing hash when reading and writing to the hash.

    Oops, a cow flew by. Therefore a corrected response: there need to be clearer rules for the deletion. Must the solution be contiguous, i.e. must it allow gaps if there is no other way to include a set that doesn't overlap another? And if two ranges overlap but are of the same size, with one higher than the other, how is a winner selected between them?

    One world, one people

      No, the [1..10] in %input=(r=>[1..10],s=>[1..10] is an array reference with values 1 to 10, which is possible. You cannot have an array without index zero, but nothing in the OP says he wants to do that.
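
The point about [1..10] can be confirmed with a short sketch: the range only supplies the values, while the indices of the anonymous array still start at zero. The variable names are illustrative only.

```perl
use strict;
use warnings;

# [1..10] builds an anonymous array holding the values 1 to 10.
my %input = ( r => [1..10] );
my $aref  = $input{r};

my $count      = scalar @$aref;   # number of elements
my $last_index = $#{$aref};       # highest index in the array

printf "elements: %d\n", $count;
printf "index 0 holds %d, index %d holds %d\n",
    $aref->[0], $last_index, $aref->[$last_index];
```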