http://qs1969.pair.com?node_id=600519

in reply to decomposing binary matrices

At the risk of wasting your time a second time, I'm pretty sure that this method works. The method is based upon a half remembered technique for simplifying complex boolean conditions that I learnt at college and have never used since. You draw up a truth table for states and then swap whole columns or whole rows until the true values group together. I seem to recall that it didn't matter how many times you swapped things around so long as you always swapped whole columns or whole rows at a time

My POC below basically consists of sorting the matrix both horizontally and vertically, and then picking out the sets with a equal number of 'leading zeros'. The problem I had was tracking the positions the 1's came from. The proper way would be to use a matrix of anonymous arrays (or objects), each containing the row, column and boolean value. For simplicity, I used arrays of strings, replacing each '1' by the letter of it's starting column, and appending the row numbers (starting at 1) to each string. I then sort these strings, then tranforms the 'matrix' into another set of strings and sort again. AT this point you extract and retain the 'row numbers' item from the array.

What you end up with is the smallest group (least 1s/most 0s) sorted to the top. You count the number of leading zeros in group (min of the first two items) and select all the items with at least that number of leading zeros and place them into the first group. Then get the min leading zeros of the first two of the remaining items, and repeat. The final group will be the 'leftovers'.

The following shows the process step by step for the OP example.

```c:\test>600418-2.pl
This input

00CD01
A0CDE2
0B0D03
AB00E4
0BCD05
sorted

00CD01
0B0D03
0BCD05
A0CDE2
AB00E4

Tranformed looks like this

000AA
0BB0B
C0CC0
DDDD0
000EE
13524
sorted

000AA
000EE
0BB0B
13524
C0CC0
DDDD0
These are the sets where the letters denote the columns of the origina
+l matrix
(0 mean column not used in this set).
And the numbers above, the rows they came from.

13524  ## Column numbers from the original data. Trim leading zeros.
000AA  ## Read this as; Group one contains
000EE  ## columns A & E from rows 2 & 4.

13524  ## Leftovers group. Trim trailing columns used by earlier group
+s.
0BB0B  ## Columns B, C, & D from Rows 1, 3 & 5.
C0CC0
DDDD0

This works as is for the other examples you've supplied (see commented out data blocks).

I do remember from those far off college days that for some complex examples, it was possible for the grouping to 'wrap over the edges'. That is, if you draw the matrix on paper, and wrap it into a cylinder to bring the left and right edges together, a group could cross the boundary. The same was possible for the top & bottom edges.

I think that this would be catered for by repeating the sort/transform/sort and select process on the smaller groups, but I haven't taken it that far yet. Unless your matrices are huge, repeating the process to subdivide the smaller groups shouldn't be a problem.

```#! perl -slw
use strict;

sub xform {
my @out;
for my \$in ( @_ ) {
\$out[ \$_ ] .= substr \$in, \$_, 1  for 0 .. length( \$in ) -1;
}
return @out;
}

my @grid = ( "00CD01", "A0CDE2", "0B0D03", "AB00E4", "0BCD05", );
#my @grid=( "000DEF1", "000DEF2", "000DE03", "ABCDEF4", "ABCDEF5", "AB
+0DEF6", );
#my @grid=( "000DE01", "000D0F2", "0000EF3", "AB00004", "A0C0005", "AB
+00006", );
#my @grid=( "0B0D001", "000DE02", "0B00E03", "A0000F4", "A0C0005", "A0
+000F6", );

print "This input\n";
print "\t\$_" for @grid;

@grid = sort @grid;
print "sorted\n";
print "\t\$_" for @grid;

my @xformed = xform @grid;

print "\nTranformed looks like this\n";
print "\t\$_" for @xformed;

@xformed = sort @xformed;
print "sorted\n";
print "\t\$_" for @xformed;

## extra column label item.
my( \$label ) = grep /1/, @xformed;
@xformed = grep !/1/, @xformed;

my @subsets;
while( @xformed ) {
my \$mask = \$xformed[ 0 ] | \$xformed[ 1 ];
if( \$mask =~ m[(^0+)]  ) {
my \$count = length( \$1 ); ## Length of common zero prefix

push @{ \$subsets[ @subsets ] }, shift( @xformed ), shift( @xfo
+rmed );
while( @xformed
and \$xformed[ 0 ] =~ m[(^0+)]
and length( \$1 ) == \$count
) {
push @{ \$subsets[ -1 ] }, shift @xformed;
}
}
else {
push @subsets, [];
push @{ \$subsets[ -1 ] }, shift @xformed while @xformed;
}
}
print <<'EOS';

These are the sets where the letters denote the columns of the origina
+l matrix
(0 mean column not used in this set).
And the numbers above, the rows they came from.
EOS

print join "\n", \$label, @\$_, "\n" for @subsets;