In 2007, I read the following task on an online puzzle site (exercise 5 here).
Cut a 6 times 6 chessboard to 18 black and 18 white squares. Assemble them to a 6 times 6 board in such a way that each row and column has 3 black and 3 white squares, every row is different, and every column is different. How many boards can you get this way? The rows and columns of the board are numbered, so boards that differ in rotation or flipping count as different.
As discussed on the J wiki, this puzzle is quite easy to solve with brute-force.
In 2008, we decided to post a larger variant of this puzzle on the first round of the Challenge24 contest. The (contest webpage has the full problem set downloadable, this is problem F.) The task was basically the following.
A seven by seven cell table has to be filled in such a way that each row and each column must have exactly two cells with a 0, two cells with a 1, and three cells with a 2 in them; no two rows can be exactly the same; and no two columns can be exactly the same. Given the contents of the first two rows, compute the number of ways the rest of the table can be filled.
With current computers, it's almost impossible to brute force this task by enumerating all possible solutions.
There is one simple observation that cuts down quite some time though.
The permutation of the bottom 5 rows doesn't matter, so it's enough to iterate on those possibilities where the bottom rows are in lexicographic order, and multiply the result you get with 120 (which is 5 factorial).
However, brute forcing takes too much time even after knowing this trick. The first working perl program I wrote takes about twenty minutes on one input. Such a long running time would be unacceptable, as we want to give ten sample inputs, and the contestants would have five hours for the whole contest (reading the tasks, development, running the program, submitting the answer, all this for all eight tasks, with only three people in a team). I didn't despair though, for I already knew that that solution can be optimized a lot.
My final solution is more complicated and much faster: it runs on any one input in at most six seconds (that was back in 2008, computers are twice as fast now). That's fast enough, so we could post this problem. (In fact I was quite surprised on how much faster the program became.)
The remainder of this post explains how my solution works, and shows the code too. I've spoilered this in case you want to try to write a solution yourself.
First, let's discuss what technique this first program uses, for it's not a trivial implementation, and the final, fast program is a development of that.
I'm representing each row as a single integer fingerprint whose seven base 100 digits are 10, 1 or 0 depending on the contents of the cell. This way, you can tell if each column has exactly two 0, two 1 and three 2 values by adding the seven fingerprints representing the rows and see if you get 2222_222222_2222 as the result. I'd of course precompute all possible rows (all permutations of two 0, two 1, and three 2), and then the program only has to iterate on all combinations of five different of these, check if the columns are right and if there are no two identical columns. (We only need to check each combination, not each variation, per the note above.)
The next version of the program runs about four times faster. Instead of trying all possible rows for the last row, I compute the only numbers to make the distribution of each column right by subtracting the fingerprints of the six rows from 2222_222222_222, then check if that's a possible row and not identical to one of the rows above.
The most important optimization of the final solution is the following. I don't iterate on all combinations of four rows and compute the last one. Instead, I first do a fast loop where I iterate on all possible combinations of the last two rows, and store these in a hash keyed by the sum of the two fingerprints. (There can be more than one pairs of rows for each such key, but never more than a few of them.) Then, I iterate on all possible combinations of the middle three rows, compute the sum of fingerprints the last two rows need to give, and look up what possible last two rows can complete the table. I lose something by this method, for now I need to check whether the sixth row is lexicographically greater than the fifth row (the middle three rows and the last two rows themselves are already sorted lexicographically), and throw away some potential solutions where this is false. This, however, is nothing compared to how much speed I gain by having to loop only three levels deep instead of four levels like above.
The final optimization concerns how I check whether the table has two identical <s>rows</s> columns, for I always have to do this in all solutions above. For this, I make a secondary fingerprint for each row. This is an integer of 21 bits: each bit corresponds to a pair of column indices in the table, and each bit is set in a fingerprint if those two cells in the row have an equal value. These numbers have the property that the bitwise and of the fingerprints of the seven rows are zero if an only if there are no two identical columns in the table. With these values precomputed for all possible rows, we can do the check very fast.
Here follows the full code, with the inputs hardcoded in the DATA section. The code is a bit ugly, for I wrote it quick and dirty to simulate how a contestant could do it live in the contest.
#!perl use warnings; use strict; use IO::Handle; use Time::HiRes "time"; sub aeq { my($a, $b) = @_; @$a == @$b or die; for (0 .. @$a - 1) { $$a[$_] == $$b[$_] or return; } return 1; } my @poss; { my @pos = (0, 0, 0, 0, 0, 0, 0); while (1) { my $k = @pos; while (0 <= $k && 2 <= $pos[--$k]) { $pos[$k] = 0; } -1 == $k and last; $pos[$k]++; my @freq = (0)x3; $freq[$_]++ for @pos; aeq(\@freq, [2, 2, 3]) or next; push @poss, [@pos]; } } #warn 0+@poss; # 210 sub key { 0 + join "", map { ("10", "01", "00")[$_] } @_; } sub ckey { 7 == @_ or die; my $r = 0; for my $x (1 .. 6) { for my $y (0 .. $x - 1) { $r <<= 1; $r += $_[$x] == $_[$y]; } } 0 + $r; } my $INF = *DATA; my $num_inp = 0 + <$INF>; for (1 .. $num_inp) { my $begin_time = time; my @r0 = <$INF> =~ /[012]/g; my @r1 = <$INF> =~ /[012]/g; my $s2 = key(@r0) + key(@r1); my $cs2 = ckey(@r0) & ckey(@r1); my @posr = grep { !aeq(\@r0, $_) && !aeq(\@r1, $_) } @poss; @posr == @poss - 2 or do { print STDERR "i ", join(" ", @r0), "\n ", join(" ", @r1), "\n +"; print STDERR "w 0\n"; print "0\n"; flush STDOUT; next; }; my @posk = map { key @$_ } @posr; my @posck = map { ckey @$_ } @posr; # most a feladat olyan az $m2 < $m3 < $m4 < $m5 < $m6 indexek össz +eszámolása, amelyekre # $posk[$mi] összege plusz $s2 egyenlő 222222_2222_2222, # és $posck[$mi] bitwise and-je and $cs2 egyenlő & 0. my %sx_lookup; for my $m5 (3 .. @posr - 2) { for my $m6 ($m5 + 1 .. @posr - 1) { my $sx = 222222_2222_2222 - $s2 - $posk[$m5] - $posk[$m6]; my $csx = $cs2 & $posck[$m5] & $posck[$m6]; push @{$sx_lookup{0 + $sx}}, [$m5, $m6, $csx]; } } my($T) = (0); print STDERR "i ", join(" ", @r0), "\n ", join(" ", @r1), "\n"; for my $m2 (0 .. @posr - 5) { for my $m3 ($m2 + 1 .. @posr - 4) { for my $m4 ($m3 + 1 .. @posr - 3) { my $sx = $posk[$m2] + $posk[$m3] + $posk[$m4]; my $csb = $posck[$m2] & $posck[$m3] & $posck[$m4]; for my $rec (@{$sx_lookup{0 + $sx}}) { my($m5, $m6, $csx) = @$rec; $m4 < $m5 or next; my $cs = $csb & $csx; 0 == $cs or next; $T++; } }} print STDERR "."; } my $R = $T * (5*4*3*2*1); print $R, "\n"; flush STDOUT; print STDERR "\no $T $R\n"; printf STDERR "t %.2f sec\n", time - $begin_time; } __DATA__ 10 2 1 0 0 2 2 1 0 2 2 1 1 0 2 2 0 0 2 2 1 1 2 0 1 2 2 1 0 2 1 0 1 2 0 2 2 2 1 1 2 0 0 2 1 0 1 2 2 0 1 2 0 2 1 2 0 0 2 0 1 2 1 2 1 2 0 2 1 0 2 2 0 0 1 2 2 1 2 1 1 0 2 2 0 1 2 2 0 0 2 1 2 0 1 0 1 2 2 2 2 1 1 0 2 0 1 0 2 1 2 0 2 2 0 1 1 2 0 2 0 1 2 2 0 1 2 0 2 2 0 1 2 1 1 0 1 2 2 2 0
There is one more notable point that surprised me when I found out during the preparations. This is that there are only fourteen (14) significantly different input data you can give in this problem. (This didn't help the contestants though, for we gave ten significantly different inputs, and this is the kind of contest where the contestants run their programs on their own computers and submit only the result.) There are thirteen possible outputs, for two of the inputs have an equal output for a reason unknown to me.
You may want to compute these fourteen inputs (pairs of two lines): this is much easier than solving the original problem. Here are them in canonical form in case you want to check.
0 0 1 1 2 2 2 0 0 1 2 1 2 2 0 0 1 1 2 2 2 0 0 2 2 1 1 2 0 0 1 1 2 2 2 0 1 0 1 2 2 2 0 0 1 1 2 2 2 0 1 0 2 1 2 2 0 0 1 1 2 2 2 0 1 1 2 0 2 2 0 0 1 1 2 2 2 0 1 2 2 0 1 2 0 0 1 1 2 2 2 0 2 0 2 1 1 2 0 0 1 1 2 2 2 0 2 1 2 0 1 2 0 0 1 1 2 2 2 0 2 2 2 0 1 1 0 0 1 1 2 2 2 1 1 0 0 2 2 2 0 0 1 1 2 2 2 1 1 0 2 0 2 2 0 0 1 1 2 2 2 1 1 2 2 0 0 2 0 0 1 1 2 2 2 1 2 0 2 0 1 2 0 0 1 1 2 2 2 1 2 2 2 0 0 1
Update: changed broken link from contest's old homepage to new homepage. Update 2015-11-26: fix typo.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Seven by seven farming puzzle
by salva (Canon) on Feb 04, 2010 at 12:17 UTC | |
by salva (Canon) on Feb 04, 2010 at 12:42 UTC | |
by ambrus (Abbot) on Feb 04, 2010 at 18:19 UTC | |
by ambrus (Abbot) on Feb 04, 2010 at 16:12 UTC | |
by ambrus (Abbot) on Feb 04, 2010 at 19:31 UTC | |
by salva (Canon) on Feb 07, 2010 at 11:15 UTC | |
by ambrus (Abbot) on Feb 14, 2010 at 16:46 UTC | |
by ambrus (Abbot) on Feb 14, 2010 at 18:06 UTC | |
by salva (Canon) on Feb 10, 2010 at 12:38 UTC |