Re: Identifying duplicates in array or hash based on a subset of data

I guess there are many ways to do it, but I was thinking of creating a hash using the type and pos as the key:

# Index each record by its id.
my $hash = {};
while ( my $line = <DATA> )
{
    next unless $line =~ /\S/;    # skip blank lines
    my ( $id, $type, $pos ) = split ' ', $line;
    $hash->{ $id } = { id   => $id,
                       type => $type,
                       pos  => $pos,
                     };
}

# Count how many ids share each type/pos combination.
my $dup_hash = {};
foreach my $id ( keys %{ $hash } )
{
    my $type_pos = $hash->{ $id }{type} . '_' . $hash->{ $id }{pos};
    $dup_hash->{ $type_pos }{count}++;
    $dup_hash->{ $type_pos }{id} = $id;    # keeps only one id per key
}

You can then use $dup_hash to check for duplicates: any key with a count greater than 1 is a type/pos combination shared by more than one id.

$dup_hash = {
  '1_10' => {
              'count' => 1,
              'id' => '1'
            },
  '1_11' => {
              'count' => 2,
              'id' => '2'
            },
  '1_15' => {
              'count' => 1,
              'id' => '4'
            },
  '2_5' => {
             'count' => 2,
             'id' => '5'
           },
  '2_7' => {
             'count' => 1,
             'id' => '7'
           }
}
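One thing to watch: the id slot in $dup_hash is overwritten each time a key repeats, so you only ever see one of the ids in a duplicate group. If you need all of them, a rough sketch along the following lines might help. The rows in @rows are made-up sample data (the original input is not shown here), and %ids_for and @dup_keys are just names I picked for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample rows (id, type, pos); the real input is not shown.
my @rows = (
    '1 1 10',
    '2 1 11',
    '3 1 11',
    '4 1 15',
    '5 2 5',
    '6 2 5',
    '7 2 7',
);

# Group ids by "type_pos" so no id is lost when a key repeats.
my %ids_for;
for my $row (@rows) {
    my ( $id, $type, $pos ) = split ' ', $row;
    push @{ $ids_for{"${type}_${pos}"} }, $id;
}

# A key is a duplicate when more than one id maps to it.
my @dup_keys = sort { $a cmp $b }
               grep { @{ $ids_for{$_} } > 1 }
               keys %ids_for;

for my $key (@dup_keys) {
    print "$key: @{ $ids_for{$key} }\n";
}
```

With that sample data it prints "1_11: 2 3" and "2_5: 5 6". The count is simply the length of each array, so a separate counter is not needed.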
