Use a HoA (hash of arrays) whose key is Type and Pos, each padded with leading zeros and concatenated so that a plain string sort puts them in the right order; the value is an anonymous array onto which the original lines are pushed. Then use grep and sort to pick out the keys that hold more than one line, in ascending Pos within Type order as per the original data, and print those lines out.
johngg@shiraz:~/perl/Monks > perl -Mstrict -Mwarnings -E '
open my $inFH, q{<}, \ <<EOD or die $!;
ID Type Pos
1 1 10
2 1 11
3 1 11
4 1 15
5 2 5
6 2 5
7 2 7
EOD

my $hdrs = <$inFH>;
my %tp;
foreach ( <$inFH> )
{
    my $key = sprintf q{%09d:%09d}, ( split )[ 1, 2 ];
    push @{ $tp{ $key } }, $_;
}

my @dupKeys =
    sort
    grep { scalar @{ $tp{ $_ } } > 1 }
    keys %tp;
print @{ $tp{ $_ } } for @dupKeys;'
2 1 11
3 1 11
5 2 5
6 2 5
johngg@shiraz:~/perl/Monks >
I hope this is of interest.
Update: ++ Marshall - goodness only knows what I was thinking, foreach ( <$inFH> ) should of course be while ( <$inFH> ). That's what happens when you retire and hardly do any coding for months :-/
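For anyone who wants it as a script rather than a one-liner, here is a minimal sketch of the same approach using the while loop. It assumes whitespace-separated columns with Type and Pos in the second and third fields, and non-negative integers that fit in nine digits. The sub name find_dups and the variable $data are made up for illustration; they are not in the code above.

```perl
use strict;
use warnings;

# Sample data copied from the one-liner; the first line is a header.
my $data = <<'EOD';
ID Type Pos
1 1 10
2 1 11
3 1 11
4 1 15
5 2 5
6 2 5
7 2 7
EOD

# find_dups is a hypothetical helper name used only for this sketch.
# It returns the original lines whose Type/Pos pair occurs more than once.
sub find_dups {
    my ($text) = @_;

    # Read from an in-memory filehandle, as the one-liner does.
    open my $inFH, q{<}, \$text or die $!;
    my $hdrs = <$inFH>;    # discard the header line

    my %tp;                # HoA: zero-padded "Type:Pos" => [ original lines ]
    while ( my $line = <$inFH> ) {
        my ( $type, $pos ) = ( split q{ }, $line )[ 1, 2 ];
        my $key = sprintf q{%09d:%09d}, $type, $pos;
        push @{ $tp{$key} }, $line;
    }
    close $inFH;

    # Keep keys holding more than one line; a string sort on the
    # zero-padded keys gives ascending Pos within Type.
    my @dupKeys = sort grep { @{ $tp{$_} } > 1 } keys %tp;
    return map { @{ $tp{$_} } } @dupKeys;
}

print find_dups($data);
```

The while form reads one line at a time, whereas foreach ( <$inFH> ) builds a list of every line before the loop starts, which matters for large files.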
Cheers,
JohnGG