I have to say that, even after you've posted three different explanations of your requirements, they are still far from clear. This produces what I think you are after:
#! perl -slw
use strict;

## Group rows by their key (first column); pack the remaining columns one byte each
my %data;
while( <DATA> ) {
    chomp;
    my @cols = split ' ';
    push @{ $data{ $cols[ 0 ] } }, pack 'C*', @cols[ 1 .. $#cols ];
}

my @keys = sort{ $a <=> $b } keys %data;

## Compare every row of each key against every row of each later key
for my $i ( 0 ..$#keys ) {
    my $key1 = $keys[ $i ];
    for my $keyset1 ( @{ $data{ $key1 } } ) {
        for my $key2 ( @keys[ $i+1 .. $#keys ] ) {
            for my $keyset2 ( @{ $data{ $key2 } } ) {
                ## String XOR: columns that are equal become chr( 0 )
                my $mask = $keyset1 ^ $keyset2;
                ## Skip pairs with no matching column at all
                next unless 1+ index $mask, chr( 0 );
                printf "%3d : %3d : ", $key1, $key2;
                ## List the 1-based positions of the matching columns
                print join ', ', map{
                    substr( $mask, $_, 1 ) eq chr(0) ? $_+1 : ()
                } 0 .. length( $mask )-1;
            }
        }
    }
}
__DATA__
12 1 2 1 2 1 1 1 1
12 2 1 2 2 1 1 1 1
15 2 1 2 2 1 1 1 1
15 2 1 2 1 1 2 1 1
16 2 1 2 1 1 2 1 1
16 2 1 2 2 1 1 1 1
19 2 1 2 1 1 2 1 1
19 1 2 1 2 1 1 1 1
116 1 2 2 2 1 1 1 1
116 2 1 2 1 1 2 1 1
Gives:
c:\test>819256
12 : 15 : 4, 5, 6, 7, 8
12 : 15 : 5, 7, 8
12 : 16 : 5, 7, 8
12 : 16 : 4, 5, 6, 7, 8
12 : 19 : 5, 7, 8
12 : 19 : 1, 2, 3, 4, 5, 6, 7, 8
12 : 116 : 1, 2, 4, 5, 6, 7, 8
12 : 116 : 5, 7, 8
12 : 15 : 1, 2, 3, 4, 5, 6, 7, 8
12 : 15 : 1, 2, 3, 5, 7, 8
12 : 16 : 1, 2, 3, 5, 7, 8
12 : 16 : 1, 2, 3, 4, 5, 6, 7, 8
12 : 19 : 1, 2, 3, 5, 7, 8
12 : 19 : 4, 5, 6, 7, 8
12 : 116 : 3, 4, 5, 6, 7, 8
12 : 116 : 1, 2, 3, 5, 7, 8
15 : 16 : 1, 2, 3, 5, 7, 8
15 : 16 : 1, 2, 3, 4, 5, 6, 7, 8
15 : 19 : 1, 2, 3, 5, 7, 8
15 : 19 : 4, 5, 6, 7, 8
15 : 116 : 3, 4, 5, 6, 7, 8
15 : 116 : 1, 2, 3, 5, 7, 8
15 : 16 : 1, 2, 3, 4, 5, 6, 7, 8
15 : 16 : 1, 2, 3, 5, 7, 8
15 : 19 : 1, 2, 3, 4, 5, 6, 7, 8
15 : 19 : 5, 7, 8
15 : 116 : 3, 5, 7, 8
15 : 116 : 1, 2, 3, 4, 5, 6, 7, 8
16 : 19 : 1, 2, 3, 4, 5, 6, 7, 8
16 : 19 : 5, 7, 8
16 : 116 : 3, 5, 7, 8
16 : 116 : 1, 2, 3, 4, 5, 6, 7, 8
16 : 19 : 1, 2, 3, 5, 7, 8
16 : 19 : 4, 5, 6, 7, 8
16 : 116 : 3, 4, 5, 6, 7, 8
16 : 116 : 1, 2, 3, 5, 7, 8
19 : 116 : 3, 5, 7, 8
19 : 116 : 1, 2, 3, 4, 5, 6, 7, 8
19 : 116 : 1, 2, 4, 5, 6, 7, 8
19 : 116 : 5, 7, 8
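To see the comparison step in isolation, here is a minimal sketch of the string-XOR trick the code above relies on: pack each row into one byte per column, XOR the two strings, and every position that comes out as a zero byte is a column where the rows agree. The helper name `matching_columns` is made up for illustration and is not part of the script; it also assumes every column value fits in a single byte (0 .. 255).

```perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): returns the 1-based
# column positions at which two rows hold the same value.
# Assumes every value is an integer in 0 .. 255, so it fits pack 'C'.
sub matching_columns {
    my( $row1, $row2 ) = @_;            # two array refs of column values
    my $packed1 = pack 'C*', @$row1;    # one byte per column
    my $packed2 = pack 'C*', @$row2;
    my $mask    = $packed1 ^ $packed2;  # string XOR: equal bytes become "\0"
    return grep { substr( $mask, $_ - 1, 1 ) eq "\0" } 1 .. length $mask;
}

# First row of key 12 against first row of key 15 from the sample data
my @same = matching_columns(
    [ 1, 2, 1, 2, 1, 1, 1, 1 ],
    [ 2, 1, 2, 2, 1, 1, 1, 1 ],
);
print join( ', ', @same ), "\n";        # prints: 4, 5, 6, 7, 8
```

That agrees with the first line of the output above. The appeal of doing it this way is that a whole row is compared in a single operation rather than column by column in a Perl-level loop.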
Note: Somewhere you say "There are more than 1 million rows and columns." If by that you mean (say) 100,000 rows of 10 columns, or 10,000 rows x 100 columns, then the above code may work if you have 2 or 3 GB of RAM.
If you actually mean 1,000,000 rows X 1,000,000 columns, then you've got a real problem on your hands.
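For a rough feel of the difference between those readings, the sketch below does the back-of-envelope arithmetic. It counts only the raw payload at one byte per cell (as `pack 'C*'` stores it) and the number of row pairs an all-against-all comparison could visit. Both sub names are made up for illustration; Perl's per-scalar and hash overhead is deliberately ignored, so real memory use will be higher, and the pair count is an upper bound because rows sharing a key are never compared with each other.

```perl
use strict;
use warnings;

# Hypothetical helpers for a rough estimate; not part of the script above.

# Raw payload only: one byte per cell, ignoring all Perl overhead.
sub payload_bytes {
    my( $rows, $cols ) = @_;
    return $rows * $cols;
}

# Upper bound on the number of row pairs compared: n * (n - 1) / 2.
sub row_pairs {
    my( $rows ) = @_;
    return $rows * ( $rows - 1 ) / 2;
}

for my $shape ( [ 100_000, 10 ], [ 10_000, 100 ], [ 1_000_000, 1_000_000 ] ) {
    my( $rows, $cols ) = @$shape;
    printf "%.3g rows x %.3g cols: %.3g payload bytes, up to %.3g row pairs\n",
        $rows, $cols, payload_bytes( $rows, $cols ), row_pairs( $rows );
}
```

The last case comes to 10^12 bytes of payload, roughly a terabyte before any overhead, which is why it will not fit in memory with this approach.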