I have to say that, even after you've posted three different explanations of your requirements, they are still far from clear. This produces what I think you are after:
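#! perl -slw
use strict;

## Group the rows by key (first column); each row is packed to one byte per column
my %data;
while( <DATA> ) {
    chomp;
    my @cols = split ' ';
    push @{ $data{ $cols[ 0 ] } }, pack 'C*', @cols[ 1 .. $#cols ];
}

my @keys = sort{ $a <=> $b } keys %data;

## Compare every row of each key against every row of each higher key
for my $i ( 0 .. $#keys ) {
    my $key1 = $keys[ $i ];
    for my $keyset1 ( @{ $data{ $key1 } } ) {
        for my $key2 ( @keys[ $i+1 .. $#keys ] ) {
            for my $keyset2 ( @{ $data{ $key2 } } ) {
                ## XORing the packed rows leaves chr(0) wherever the columns match
                my $mask = $keyset1 ^ $keyset2;
                ## Skip pairs that have no matching column at all
                next unless 1+ index $mask, chr( 0 );
                printf "%3d : %3d : ", $key1, $key2;
                ## List the 1-based positions of the matching columns
                print join ', ', map{
                    substr( $mask, $_, 1 ) eq chr(0) ? $_+1 : ()
                } 0 .. length( $mask )-1;
            }
        }
    }
}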
__DATA__
12 1 2 1 2 1 1 1 1
12 2 1 2 2 1 1 1 1
15 2 1 2 2 1 1 1 1
15 2 1 2 1 1 2 1 1
16 2 1 2 1 1 2 1 1
16 2 1 2 2 1 1 1 1
19 2 1 2 1 1 2 1 1
19 1 2 1 2 1 1 1 1
116 1 2 2 2 1 1 1 1
116 2 1 2 1 1 2 1 1
Gives:
c:\test>819256
12 : 15 : 4, 5, 6, 7, 8
12 : 15 : 5, 7, 8
12 : 16 : 5, 7, 8
12 : 16 : 4, 5, 6, 7, 8
12 : 19 : 5, 7, 8
12 : 19 : 1, 2, 3, 4, 5, 6, 7, 8
12 : 116 : 1, 2, 4, 5, 6, 7, 8
12 : 116 : 5, 7, 8
12 : 15 : 1, 2, 3, 4, 5, 6, 7, 8
12 : 15 : 1, 2, 3, 5, 7, 8
12 : 16 : 1, 2, 3, 5, 7, 8
12 : 16 : 1, 2, 3, 4, 5, 6, 7, 8
12 : 19 : 1, 2, 3, 5, 7, 8
12 : 19 : 4, 5, 6, 7, 8
12 : 116 : 3, 4, 5, 6, 7, 8
12 : 116 : 1, 2, 3, 5, 7, 8
15 : 16 : 1, 2, 3, 5, 7, 8
15 : 16 : 1, 2, 3, 4, 5, 6, 7, 8
15 : 19 : 1, 2, 3, 5, 7, 8
15 : 19 : 4, 5, 6, 7, 8
15 : 116 : 3, 4, 5, 6, 7, 8
15 : 116 : 1, 2, 3, 5, 7, 8
15 : 16 : 1, 2, 3, 4, 5, 6, 7, 8
15 : 16 : 1, 2, 3, 5, 7, 8
15 : 19 : 1, 2, 3, 4, 5, 6, 7, 8
15 : 19 : 5, 7, 8
15 : 116 : 3, 5, 7, 8
15 : 116 : 1, 2, 3, 4, 5, 6, 7, 8
16 : 19 : 1, 2, 3, 4, 5, 6, 7, 8
16 : 19 : 5, 7, 8
16 : 116 : 3, 5, 7, 8
16 : 116 : 1, 2, 3, 4, 5, 6, 7, 8
16 : 19 : 1, 2, 3, 5, 7, 8
16 : 19 : 4, 5, 6, 7, 8
16 : 116 : 3, 4, 5, 6, 7, 8
16 : 116 : 1, 2, 3, 5, 7, 8
19 : 116 : 3, 5, 7, 8
19 : 116 : 1, 2, 3, 4, 5, 6, 7, 8
19 : 116 : 1, 2, 4, 5, 6, 7, 8
19 : 116 : 5, 7, 8
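If the XOR step looks like magic, here is the same idea in isolation. This is only a minimal sketch: the sub name matching_columns is made up for illustration, and it assumes every value fits in a single byte (0 .. 255) and that both rows have the same number of columns. Packing with 'C*' gives one byte per column, and a string XOR of two such strings yields chr(0) exactly where the bytes are equal.

```perl
#! perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): returns the 1-based
# column numbers at which two equal-length rows of small integers agree.
sub matching_columns {
    my( $row1, $row2 ) = @_;

    # One byte per column; the string XOR is chr(0) wherever the bytes match.
    my $mask = pack( 'C*', @$row1 ) ^ pack( 'C*', @$row2 );

    return map { substr( $mask, $_, 1 ) eq chr( 0 ) ? $_ + 1 : () }
        0 .. length( $mask ) - 1;
}

# The first rows for keys 12 and 15 from the sample data.
my @matches = matching_columns(
    [ 1, 2, 1, 2, 1, 1, 1, 1 ],
    [ 2, 1, 2, 2, 1, 1, 1, 1 ],
);
print join( ', ', @matches ), "\n";    # 4, 5, 6, 7, 8
```

That reproduces the first line of the output above. Two rows with nothing in common return an empty list, which is the case the script skips with its index test.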
Note: Somewhere you say "There are more than 1 million rows and columns." If by that you mean (say) 100,000 rows of 10 columns, or 10,000 rows x 100 columns, then the above code may work if you have 2 or 3 GB of RAM.
If you actually mean 1,000,000 rows x 1,000,000 columns, then you've got a real problem on your hands.
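For a rough sense of scale, here is a back-of-envelope sketch. It assumes the same one-byte-per-cell packing as the code above and counts only the packed data; Perl's own overhead for every scalar, array and hash entry comes on top of that and is not modelled here. The sub name packed_bytes is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: bytes of packed data for rows x cols cells at one
# byte per cell. Deliberately ignores Perl's per-scalar overhead.
sub packed_bytes {
    my( $rows, $cols ) = @_;
    return $rows * $cols;
}

for my $shape ( [ 100_000, 10 ], [ 10_000, 100 ], [ 1_000_000, 1_000_000 ] ) {
    my( $rows, $cols ) = @$shape;
    printf "%9d rows x %9d cols : %.3g bytes of packed data\n",
        $rows, $cols, packed_bytes( $rows, $cols );
}
```

The last shape comes to 10^12 bytes, roughly a terabyte, before any overhead at all, so holding it in memory the way the code above does is not an option on an ordinary machine.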